Custom Python scripts are much more customizable than Excel spreadsheets. This is good news for SEOs — this can lead to optimization opportunities and low-hanging fruit. One way you can use Python to uncover these opportunities is by pairing it with natural language processing. This way, you can match how your audience searches with your...READ POST
Internet security is a big problem, and it isn’t just for the IT staff anymore. It affects us as SEOs. Don’t believe me? Consider the incident reported at the end of last year by security research firm Sunbelt Software.
…criminals are now combining SEO tactics and booby-trapped Web pages, and doing it systematically. By posting tens of thousands of Web sites simultaneously, criminals can take over all the top spots on a search results page, casting a wide net that’s more likely to catch Web users. Eckelberry described these criminals as “SEO Gods,” saying they can “take any site and get it on the first page of Google results.”
Instead of wasting energy defacing sites and showing them off as trophies to their peers on IRC, hackers are now modifying the code of hacked sites to include (invisible) links to their web properties or link farms. The article talks about virus writers creating tens of thousands of websites and cross-linking them using all sorts of queries as anchor text. They then spam blog comments around the Web to improve the overall PageRank of the link farm.
Hackers already know how to break into sites. Now that they see the profit that can be made from top-ten search rankings, they have adapted their techniques to break to take advantage. Currently, search engines’ quality reviewers can detect most sites utilizing these black-hat techniques because they show up pretty obviously as SPAM. However, this is just the beginning, and I’m willing to predict that this is going to scale with cleverer hacks that are harder to detect. Most break-ins will be highly sophisticated and highly automated. They will “recruit” thousands of computers into their link-farm. If your site is one of those “recruited” without your knowledge, your site will most likely be penalized by the search engine along with the whole group.
How can somebody break into my server if they don’t know my password?
I remember my days working for a big ISP, setting up firewalls, installing the latest patches and hardening servers. It was a constant battle between the hackers and me (crackers is the correct term, but I will use hackers out of habit). One day one of the consultants the company hired to do penetration testing told me that I was not letting him “do his job.” He meant breaking into the servers of course; the only thing left for me to do, he said, was to disconnect the servers from the network. I couldn’t resist laughing out loud.
I had another boss once that would ask me simply to change the passwords each time our sites got hacked. He didn’t even want to buy a firewall, the most basic form of protection. Why changing the password is ineffective may be too obvious for those of you with some security background—but that’s clearly not everyone.
Hackers break into systems by exploiting software vulnerabilities. These vulnerabilities exist because most software is tested under “normal” circumstances. Software developers don’t usually expect users to provide input designed to fool the program into doing something it was not designed to do. But that’s exactly what hackers do using buffer overflows, string format attacks, script and SQL injection, default passwords, and other tools of the trade.
Protect your site from hackers now
You can protect your site or blog from such attacks, however. The first order of business is fairly straightforward:
Server hardening. Update all software, apply the latest security patches and disable all unneeded services.
Install a firewall.
Install an Internet security scanner and instruction detection, such as snort.org. Set it up to poll your site every day and address all issues that come up in the reports.
Unfortunately, some setups require a large number of software packages and keeping that list of components up to date can be quite a nightmare. The most common approach to deal with this is to use a multilayer approach—separate servers that do specific functions, such as a web server, database server, application server, etc. It is also common to host the blog, forum, chat rooms, and other elements on separate servers because each requires different applications and poses new security risks. The idea behind all this is to, at the very least, isolate the sensitive parts of your system, like your e-commerce components, customer list, and other delicate information.
Where SEO meets security
When you set up a blog or forum on a separate server, you still want to have it linked from the main site, typically using subdomains like forums.sitename.com or blog.sitename.com. The problem with this approach for SEO purposes is that search engines regularly treat each subdomain as a separate site when counting incoming links. The incoming link juice is therefore split among the domains. Google makes an exception only when displaying search results.
The single domain will benefit from higher rankings if links to the subdomains are funneled to the main one. Luckily, there is a technique to do this—reverse proxies. I have mentioned reverse proxies in the past and they are very useful beasts. In a nutshell, a reverse proxy sits in front of the web server, receives all requests, does some special processing (such as caching) and forwards the requests to the actual servers. A reverse proxy can be used to map URLs to different servers, and this feature comes in very handy for SEO.
We can use Apache’s mod_proxy for this. Here is a sample configuration
Allow from all
ProxyPass /blog http://blog.sitename.com
ProxyPassReverse /forum http://forum.sitename.com
Instead of directing users to blog.company.com, we write a reverse proxy rule to send requests for company.com/blog to the internal server blog.company.com. We can do the same for forums, chat, e-commerce systems, and so on. It is completely transparent to the user (and search engines) that the website is divided among multiple servers. Note that each web server will need to be isolated completely for the security to work. If someone breaks into the blog because the software hasn’t been updated, for instance, at least he won’t get to the e-commerce system.
Internet security is a very large (and interesting) topic. I will talk about it more in the future if there is enough interest. As usual, please share what you think in the comments.