Oct 13 2008
Website Crawling
Make sure your site can be “crawled”
A site can be crawled when the links to and within it can be discovered and followed by search engine spiders. Spiders, or bots, are programs that search engines send out to find and revisit content such as web pages, images, video and PDF files. If a search engine spider cannot follow a link, the destination page will either not be included at all, or will exist in the search engine database but be left out of the search results.
Reasons a website may not be indexed by a search engine include:
• Navigation links embedded in Flash – search engine spiders usually do not crawl links within Flash files.
• Navigation links embedded in JavaScript or Ajax – again, some spiders may still have issues crawling them.
• Navigation links embedded in forms – if the user has to select an item from a drop-down menu or fill in a form field to see content, that content won’t be indexed.
• Lack of authoritative links into the web site – a lack of relevant links to the home page and interior pages of a site could cause the site not to be crawled.
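To make the first three points concrete, here is a minimal Python sketch of how a very simple spider might discover links. It is an illustration only, not how any particular search engine works: it assumes the spider reads nothing but plain `<a href>` tags, and the sample page and its URLs are made up.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from plain <a> tags, as a very simple spider might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only ordinary anchor tags are treated as followable links.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover_links(html):
    """Return the list of link targets this simplified spider can see."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical page: one text link, one JavaScript-only link, one form menu.
SAMPLE_PAGE = """
<a href="/about.html">About</a>
<span onclick="window.location='/products.html'">Products</span>
<form action="/go">
  <select name="page"><option value="/contact.html">Contact</option></select>
</form>
"""

print(discover_links(SAMPLE_PAGE))  # ['/about.html']
```

Only the plain text link is found. The pages reachable through the script and the drop-down menu stay invisible to this spider, which is why text-link alternatives matter.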
Possible solutions:
• Create navigation elements with search engine friendly CSS code that still offers the dynamic functionality often found in Flash, JavaScript and Ajax.
• Create alternative navigation with text links elsewhere on the web page, in the footer and/or in the in-page navigation.
• Create HTML site map pages made up of 100 or fewer text links to important pages on your site. You can have more than one site map page for sites larger than 100 pages.
• Provide search engines with an XML site map listing all the URLs from your web site that you would like crawled.
• Encourage inbound links from authoritative web sites to your home page as well as to important (and linkable) content within the site.
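The two site map suggestions can be sketched in Python as follows. This is a rough example under stated assumptions: the function names and the example.com URLs are hypothetical, and the XML output contains only the `urlset`, `url` and `loc` elements of the sitemaps.org format, with no optional fields.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org XML format.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(urls):
    """Return a minimal XML site map string listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def split_into_pages(urls, max_links=100):
    """Group URLs into chunks of at most max_links, one chunk per HTML site map page."""
    return [urls[i:i + max_links] for i in range(0, len(urls), max_links)]


# Hypothetical site with 250 pages.
site_urls = [f"https://www.example.com/page-{n}.html" for n in range(1, 251)]

xml_sitemap = build_xml_sitemap(site_urls)
html_pages = split_into_pages(site_urls)
print(len(html_pages))  # 3 HTML site map pages: 100, 100 and 50 links
```

Each chunk returned by `split_into_pages` would then be rendered as one HTML page of text links, while the XML string is saved as a file and submitted to the search engines.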