How to Avoid the Google Sandbox

Discussion in 'Webmaster Articles' started by Cormac, Jun 13, 2007.

  1. Cormac

    Cormac New Member

    The Google Sandbox is the term given to the holding area for domains that Google has raised a red flag against. When a domain is placed in the Sandbox it does not receive a ranking in the search engine, its content does not get crawled by Google’s spiders and the website’s indexed pages get placed into a supplemental index. Getting sandboxed is one of the nightmare occurrences for webmasters and online traders.

    If you have conducted and implemented proper keyword research in relation to your product then search engine referrals can account for up to 90% of your traffic. Invisible websites cash invisible cheques.

    Google can place a website in the Sandbox if it meets any of the following criteria:
    • If the domain name is newly registered.
    • If the domain or website is constantly changing either its IP or DNS address(es).
    • If your website links to or receives links from ‘bad neighbourhoods’ (such as other websites in the Sandbox or those with questionable content).
    • If your website is involved in a link farm or if it has used Black Hat tactics to achieve a higher than justified ranking in the search engines.
    • If you have abused 301 permanent redirects.
    Some people believe that the Sandbox doesn’t exist. In a 2005 interview Matt Cutts, the head of Google’s webspam team, explained that Google’s search algorithm “might affect some sites, under some circumstances, in a way that a webmaster would perceive as being sandboxed.”


    How to tell if you have been Sandboxed

    The quickest way to determine if you have been Sandboxed is to check if your content is being indexed by the search engines. Open up Google.com and type the following command into the search bar – site:your-domain-name.com – I have demonstrated this in the screenshot below.

    [​IMG]

    The ‘site’ command runs a query on Google’s data centres and determines how many of your site’s pages are indexed. Only indexed pages within Google appear to the end user as results when they search. If your site isn’t indexed then it won’t appear.
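
    As a rough sketch, the same query can be built as a URL to open in a browser. The helper name site_query_url is made up for illustration, and the https://www.google.com/search?q=... address format is an assumption about how the search page takes its query, not something stated in this article.

```python
from urllib.parse import urlencode


def site_query_url(domain: str) -> str:
    """Build a search URL for the query 'site:<domain>'.

    Assumption: the search page accepts its query in a 'q' parameter.
    """
    return "https://www.google.com/search?" + urlencode({"q": f"site:{domain}"})


# Print the URL for the placeholder domain used in the article.
print(site_query_url("your-domain-name.com"))
```

    Paste the printed URL into your browser and read the result count by hand; the snippet only builds the address and does not fetch anything.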


    The image below is the corresponding result to the site query.

    [​IMG]
    It displays the number of indexed pages in Google for the domain macblogger.net. As you can see, there are approximately 31 pages of this website indexed by Google. This indicates that the domain is not Sandboxed. If the domain were Sandboxed then there would be a grand total of zero results.


    How to avoid getting Sandboxed

    Websites that Google trusts do not appear in the Sandbox. So how do you gain trust?

    This can be a time-consuming process, not because getting un-boxed requires a lot of attention or man-hours on your part, but because Google takes its time in applying trust to new websites. Patience is a virtue in getting un-boxed, as it can take up to six months to reappear in the search engine results pages (SERPs).

    In saying that, there are a few recommendations I can make to new site owners who hope to avoid getting Sandboxed in the first place.

    “The Do Not’s”

    Don’t use search engine auto submit tools.

    Google, and the other search engines, want things to occur naturally. Using auto submit tools to submit your site to hundreds of search engines is a sure way to draw unwanted attention to your website. This can be seen as mass spamming.

    I don’t recommend going to Google’s Add URL (Add your URL to Google) page either. By using this tool you are adding your website to a list of sites for Google to crawl. It’s best to avoid this list and instead focus on getting a handful of quality back links from relevant and trusted sites. This is the organic approach which Google prefers.


    Do not race into a link building strategy.

    Google loves links. They are one of the major factors which can propel your website from search engine obscurity to the top of the rankings. If Google feels that your site is obtaining too many links too quickly then your site is a candidate for the Sandbox. Google wants to see organic linking between websites which are relevant to one another. Very rarely does a brand new website receive hundreds of links instantly. Google may think you are involved in a link farm or that you are buying links in order to increase your back links.

    Build your links slowly and as naturally as possible. Choose wisely who you link to and who you ask to receive links from. If you are in doubt over linking to a website then ‘rel=nofollow’ the link to that website. The ‘nofollow’ attribute in links advises search engine spiders not to follow the link, as you cannot be certain the link is trustworthy.
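
    To show what that looks like in markup, here is a minimal sketch that writes a link with or without the attribute. The function name link and its trusted flag are hypothetical; only the rel="nofollow" attribute itself comes from the advice above.

```python
from html import escape


def link(url: str, text: str, trusted: bool = True) -> str:
    """Return an HTML anchor tag, adding rel="nofollow" when the target is not trusted."""
    rel = "" if trusted else ' rel="nofollow"'
    # Escape the URL and link text so special characters cannot break the markup.
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


# A link you are unsure about gets the nofollow attribute.
print(link("http://example.com/", "Example", trusted=False))
```

    A site you are happy to vouch for would simply be written with trusted=True, which leaves the attribute out.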


    Respect 301 redirects.

    If your website has moved domain from old-domain.com to new-domain.com then a 301 redirect is normally used to transfer the entire site’s content from the old domain to the new domain. This is sometimes abused when developers purchase a number of websites and attempt to redirect those sites’ indexed content to the new domain.
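
    For anyone who has not set one up, the following is a minimal sketch of a legitimate whole-site 301 using Python's standard library, assuming the old domain is served by this small script. NEW_ORIGIN, redirect_target, MovedHandler, serve and the port number are illustrative names and values, and in practice most people configure the redirect in their web server rather than in code.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the site has moved to this address (placeholder from the article).
NEW_ORIGIN = "http://new-domain.com"


def redirect_target(path: str) -> str:
    """Map a path requested on the old domain to the same path on the new domain."""
    return NEW_ORIGIN + path


class MovedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # 301 tells browsers and spiders that the move is permanent.
        self.send_response(301)
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    # Answer HEAD requests the same way as GET.
    do_HEAD = do_GET


def serve(port: int = 8080) -> None:
    """Run the redirecting server until it is interrupted."""
    HTTPServer(("", port), MovedHandler).serve_forever()
```

    Calling serve() starts the server. Each old page is sent to its own equivalent on the new domain, which is the one-to-one move that a 301 is meant for.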

    “The Do’s”

    Register your Domain for as long as possible.

    Most new websites register their domain name for a single year and renew it annually. If you register your domain for as long as you can afford then you build trust with the search engines. By registering your domain name for five or more years you are giving a clear indication that you plan on being here for the long haul. Google became an official Domain Name Registrar in 2005 and, in my view, is most likely using domain registration records within its search algorithm to some extent.

    Purchase an SSL Certificate and use it.

    SSL Certificates are used on web pages which require their content to be encrypted. Checkout pages for online shops and logins for websites generally use SSL to build trust with the end user. Search engines can respond in the same way. By securing a page on your website with an SSL Certificate you are further building trust with the search engines.

    Host your website on a Dedicated IP.

    Most small websites are hosted on shared servers. A shared server can contain thousands of other websites. If some of these websites get blacklisted by Google or any other search engine then there is a possibility that your site might follow suit and join them on the list. This is because all of the websites hosted on the shared server share an IP address with each other. It’s best to avoid any common denominators with blacklisted sites and the only way to do so is to host on a dedicated server. If you are serious about your company’s online presence then you should use dedicated hosting.

    Sit on your Site.

    Once you have registered your site you should put a few pages of content up there. These pages don’t have to be part of the final design of the site but they act as an indicator of the site’s future content. By placing content on the site you are providing search engine spiders with content to crawl. It is their job to crawl websites. If all you have is a logo with ‘coming soon’ text on your site, then the spider won’t be in a hurry to return.

    Leave approximately four or five pages of content on your site. In a month’s time check to see if your site appears in the search engine results by running the command ‘site:my-domain-name.com’. If the spiders have indexed your site then you have successfully avoided the Sandbox.

    Conclusion


    It is imperative that you avoid the Sandbox. If you abuse the system then the system will make you pay. Read up on Google’s Guidelines for Webmasters (Webmaster Help Center) and avoid Black Hat SEO tactics. If you implement the advice I have provided then you should avoid having your website Sandboxed.
     
  2. mneylon

    mneylon Administrator Staff Member

    Thanks for the excellent article

    I don't agree with all your points, such as the one about the dedicated IP, but it's still a good article
     
  3. 3rigena

    3rigena New Member

    [deleted]
     
  4. Cormac

    Cormac New Member

    Thanks guys,
    3rigena, it's a bit of a grey area, but SSL is one of the quick-fire ways to distinguish your site from a spam site. The inclusion of SSL is one way to gain trust with SEs in my opinion.

    Michele, the dedicated IP thing is a bit of an overstep and a somewhat paranoid approach to avoiding the Sandbox, but if you look at the Netcraft risk rating, it usually displays a tiny red speck for sites hosted on shared servers but rarely does for sites hosted on dedicated IPs. It's all about trust and if I were Google I would trust a site on a dedicated server/IP more so than one on a shared server.
     
  5. mneylon

    mneylon Administrator Staff Member

    Google might be demented, but not stupid.

    Any bullet proof hosting plan would come with a dedicated IP. So does that mean you should trust spammers?
     
  6. paul

    paul Ninja

    I've heard that thing about not submitting to Google. But I can't for the life of me see why it could cause you to get 'sandboxed'. Either way none of my domains have been put in the sandbox.

    Does anyone have an example of a domain that is?
     
  7. Cormac

    Cormac New Member

    Mmm, it looks like Cutts agrees

    Not so much Google, but my blog dropped off MSN/Live completely for a month. That could just be down to those SE's being borked up though.
     
  8. mneylon

    mneylon Administrator Staff Member

    Damn straight

    I currently have most of my sites on one of two IPs and it has no obvious effect on their PageRank etc.
     
  9. 3rigena

    3rigena New Member

    [deleted]
     
  10. mneylon

    mneylon Administrator Staff Member

    The problem with any of these theories is that they are just that - theories.

    Unless the search engine operators actually make a definitive statement (unlikely) then we are all playing a guessing game. It might be an educated guess :)
     
  11. Cormac

    Cormac New Member

    3rigena, I would imagine that all things being equal between two websites, the website which has a secured page will sooner avoid the sandbox than the one without. There are many factors involved and I believe SSL to be one of them, how much weight is applied to it is of course another thing.
     
  12. RedCardinal

    RedCardinal New Member

    Hi Cormac

    Hope you don't mind me pointing out one or two things:
    This is completely incorrect - if your pages aren't indexed they cannot be sandboxed. If you want to see if you are sandboxed the first thing to check is whether you are indexed. The site: operator is not affected by sandbox dampening. Then you need to try searching for unique strings contained in the page titles or text of those pages. Generally a site that is sandboxed won't rank for anything other than the most long tail queries, and perhaps the actual domain name.

    Your points about domain registration duration and SSL have been validated in Google patents. 10yr registration will help.

    The only true way to avoid the sandbox altogether is to get really great authority links pointed at the domain. As long as you have quality you don't need to worry about quantity (i.e. thousands of inbound links won't hurt at all if you have some well trusted authority links in the mix).

    Nice article though :)
     
  13. mneylon

    mneylon Administrator Staff Member

    Only with .com

    Google doesn't know how to track ccTLDs very well and is quite clueless when it comes to IE domains
     
  14. RedCardinal

    RedCardinal New Member

    Morning Michele
    God I do hope so.... :D

    They can, and do, track things like nameserver and IP data which is used to look for changes and irregularities. If you're interested there is a great patent about search results based on historical data (good Lord above - I can't believe I'm calling a patent 'great'...). I'll dig out the URL if you want some quality bedtime reading material :)

    I'm running a little experiment at the minute to see how well they handle dropped .ie domains.
     
  15. mneylon

    mneylon Administrator Staff Member

    Well .. from my own experience they don't seem to track them properly at all

    I've got about a half dozen or so "pre-owned" IE domains
     
  16. Cormac

    Cormac New Member

    Isn't that what I said though?
     
  17. RedCardinal

    RedCardinal New Member

    Hmmm... I read that to say that if you perform a site: query and pages are returned then your site isn't sandboxed:
    If there is a grand total of zero results then your site isn't indexed and therefore cannot be sandboxed. Am I missing something here?
     
  18. Cormac

    Cormac New Member

    What I'm saying is:
    0 pages indexed = sandboxed
    many pages indexed = not sandboxed
     
  19. RedCardinal

    RedCardinal New Member

    That's what I thought Cormac.

    0 pages indexed = cannot be sandboxed because by definition if you are not indexed you can't be under a dampening effect.
    many pages indexed = might be sandboxed

    The sandbox is not applied to the site operator - the fact that results are returned using a site: query is the same regardless of whether the site is under a 'sandbox' effect.

    Sandbox does not remove pages from the index, it simply applies a dampening factor so that those pages will not rank above a certain threshold for competitive or semi-competitive search queries. You can only see sandbox effect using normal queries not the site: operator.
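
    A rough sketch of that logic, if it helps. The function name and both inputs are made up for illustration, and you would fill them in by hand from a site: query and a few normal searches; nothing here talks to Google.

```python
def sandbox_guess(indexed_pages: int, ranks_for_competitive_queries: bool) -> str:
    """Classify a site from two hand-collected observations.

    indexed_pages: result count seen for a site: query.
    ranks_for_competitive_queries: whether normal (non site:) searches for
    competitive or semi-competitive terms show the site at all.
    """
    if indexed_pages == 0:
        # Not indexed, so there is nothing for a dampening factor to act on.
        return "not indexed, cannot be sandboxed"
    if ranks_for_competitive_queries:
        return "indexed and ranking, not sandboxed"
    # Indexed but held below the threshold for competitive queries.
    return "indexed but not ranking, might be sandboxed"


print(sandbox_guess(31, False))
```

    The last case is only a 'might be' since a site can fail to rank for plenty of ordinary reasons too.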
     
  20. Cormac

    Cormac New Member

    Ah, now I get you. I thought that if you were sandboxed you wouldn't rank thus meaning that your indexed pages were put on hold.

    Does your PR ranking get frozen (greyed out) when you're sandboxed though?
     
