Flash SEO – introduction

Posted on April 16, 2009


Yesterday I got into a discussion regarding Flash and SEO. The old paradigm that “Flash is not searchable” is – and has been – a misconception. So it was time for me to write a blog post about this.

Why not?

It is actually strange that SEO for Flash sites has been obscure for so long.

Google and Adobe

Google and Adobe announced a new algorithm for indexing textual Flash content on June 20, 2008. See some basic Q&A from Google here. There is, however, no clear mention of what happens with dynamically loaded content. The challenge is that while Google is spidering the Flash movie, the XML data has not been loaded yet, so the spider bot has to wait for Flash events and custom-made parsing actions before it can make its next move.

Will it be able to separate content from instructions in the XML? Will it be able to track and interpret implicit links that load other data but do not look like links at all? And how will you be able to get directly to that part of the content from a Google results page?

Apart from that, basic content in Flash is rather “unstructured”. Headers, body text and other text snippets can be placed anywhere, without any direct relationship between the parts. And regarding relevance I quote the Google Q&A article: “If you prefer Google to ignore your less informative content, such as a “copyright” or “loading” message, consider replacing the text within an image, which will make it effectively invisible to us.” (What kind of solution is that!) According to critics (like me and, e.g., this one), the SEO result of this approach is very unreliable.

Regarding dynamic data and JavaScript (quote Google again):

  1. Googlebot does not execute some types of JavaScript. So if your web page loads a Flash file via JavaScript, Google may not be aware of that Flash file, in which case it will not be indexed.
  2. We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file.
  3. While we are able to index Flash in almost all of the languages found on the web, currently there are difficulties with Flash content written in bidirectional languages. Until this is fixed, we will be unable to index Hebrew language or Arabic language content from Flash files.

I personally do not believe in the Adobe/Google approach as stated in 2) and 3), for simple or complex sites. There are too many wildcards, and something as simple as “spidering a site structure” becomes overly complex because it is done indirectly, via a medium whose structural principles are completely different from those of HTML. I shiver at the implications for me as a coder when building data-driven sites. It will be like painting a wall while holding the ladder, the paint bucket and the brush at the same time in order not to fall over.

Another, simpler basis (with other wildcards)

In 2004, Peter Hall talked about SEO at the Dutch “Flashtival”, via the combined use of XHTML and Flash. Based on that I made some prototypes myself in 2004 and 2005 and shared my findings on Figleaf's Flashcoders list (see the quote on Jesse Warden's site; the links mentioned there are dead). It is hard to believe that Adobe and Google apparently did not consider, or at least mention, the simple and straightforward “Peter Hall” solution (a private 1:1 “Look how smart we are” nerd fest?). The “Peter Hall approach” works like this:

  1. You publish your entire site-content as a normal website using XHTML pages
  2. Each XHTML page has an Include for the Main Flash Movie.
  3. When the XHTML page (“Page X”) is loaded, it loads the Main Flash Movie with a hard reference to itself
  4. The Main Flash Movie uses that reference and loads the “Page X” XHTML page as XML.

To make sure that the search engine only “sees” the (X)HTML, you use JavaScript and SWFObject. According to this article and the Google quote above, the JavaScript (and with it the Flash movie) will not be executed by the search engine spiders.
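The four steps can be sketched roughly as follows. This is a minimal illustration in Python, not the original ActionScript: the page template, the file names (`main.swf`, `page10.html`, `page11.html`), the `content` element id and the `pageUrl` flashvar are all made up for the example. `render_page` builds an ordinary XHTML page whose script hands the page's own URL to the movie through SWFObject's `embedSWF` call, and `extract_content` parses that same page as XML, which is what the main Flash movie would do in step 4, and is also roughly what a spider that skips the JavaScript gets to read.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical "Page X": normal XHTML content, plus a script that replaces
# the content div with the main movie and passes the page's own URL along.
# All file names, ids and the flashvar name are assumptions for this sketch.
PAGE_TEMPLATE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>{title}</title>
    <script type="text/javascript" src="swfobject.js"></script>
    <script type="text/javascript">
      swfobject.embedSWF("main.swf", "content", "100%", "100%", "9.0.0",
                         false, {{ pageUrl: "{page_url}" }});
    </script>
  </head>
  <body>
    <div id="content">
      <h1>{title}</h1>
      <p>{body}</p>
      <a href="page11.html">Next page</a>
    </div>
  </body>
</html>"""


def render_page(page_url, title, body):
    """Steps 1-3: publish the content as XHTML with a reference to itself."""
    return PAGE_TEMPLATE.format(
        page_url=escape(page_url), title=escape(title), body=escape(body)
    )


def extract_content(xhtml):
    """Step 4: read the XHTML page as XML and pull out the content."""
    ns = {"x": XHTML_NS}
    root = ET.fromstring(xhtml)
    content = root.find(".//x:div[@id='content']", ns)
    return {
        "title": root.findtext("x:head/x:title", namespaces=ns),
        "heading": content.findtext("x:h1", namespaces=ns),
        "paragraphs": [p.text for p in content.findall("x:p", ns)],
        "links": [a.get("href") for a in content.findall("x:a", ns)],
    }


if __name__ == "__main__":
    page = render_page("page10.html", "Page 10", "Tips & tricks")
    print(extract_content(page))
```

The point of the design is that there is only one source of content: the XHTML page. The movie and the spider read the same markup, so the links a spider follows are the same links the movie can use for deeplinking.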

Content cloaking?

From one perspective, what happens here might be considered “content cloaking”.

Now parties like Google are not fond of content cloaking, and for good reason. It has been eagerly abused in the past to get porn sites a #1 ranking. Where the spider bot finds “1001 tips for the CEO”, the user will see “1001 hot sexy babes ready and willing to get laid”, a lot of pink and not one single tip for the CEO. In most cases the container with the SEO info was made invisible using JavaScript or CSS.

Apparently a site only gets listed as a cloaker when people report it, so as long as the page delivers the information shown in the search results, this might not be an issue. I do not have any experience here, so cloaking might not be an issue or risk at all. (I appreciate any insights and feedback on this to deepen the value of these articles.)

Alternative presentation?

From another point of view, presenting an (X)HTML page as Flash can be regarded as an alternative presentation with a “tech fallback” (see this by the W3C).

Spidering

At the SWFAddress project I found another helpful piece about the SEO solution chosen by Adobe and Google. It describes how to prevent search engines from searching the Flash file (and so prevent garbled SEO results).
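One common way to keep crawlers away from the movie itself is a robots.txt rule. That is an assumption on my part for illustration; the linked piece may describe a different method. The sketch below uses Python's standard `urllib.robotparser` to check what such a rule would allow. The `/flash/` directory and the `example.com` host are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all SWF files live in /flash/, which is closed
# to every crawler, while the XHTML pages stay open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /flash/
"""


def allowed(path, agent="Googlebot"):
    """Return True if the given crawler may fetch the path under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/flash/main.swf", "/page10.html"):
        print(path, allowed(path))
```

With a rule like this the XHTML pages remain the only thing the spider indexes, which is exactly what the approach above relies on.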

Samples

So far, the “alternative presentation” approach is working. A site like Hilfiger Denim apparently gets good search results (March 2009), and as these search results show, deeplinking into the site from search results works perfectly. To see how Google sees the site, disable JavaScript in your browser: the page will be presented as plain HTML.

Check my 2005 site, which uses Flash (6.x), XHTML and the approach described above (without SWFObject). Check: home, page 10, page 18, and also look at the page source. (It was a quick job between jobs, so do not flame me for the dirty HTML / CSS.)

Other links

Read these or these original 5 pointers for SEO and Flash.

Advantages of the (X)HTML + Flash approach

  1. It is simple and straightforward.
  2. It follows the exact same rules as a normal website.
  3. The content is 100% readable for spiders and bots.
  4. Size does not matter. (Depth of the site, number of pages).
  5. Deeplinking leads straight to the referred page in the Flash-site.

Risks of the (X)HTML + Flash approach

The new approach from Adobe and Google to search inside Flash movies apparently relies on executing JavaScript and Flash. It is unclear what Google's next step will be in interpreting web pages for search, so the trick described above might stop working at some point. Also, anything that is taken for content cloaking can get a site banned from the search results.

Conclusion: be optimistic and do it

When introducing SEO strategies for Flash websites, we DO need to stay alert. As long as search engines do not offer a clear approach to SEO for “non HTML content” (my labelling), any approach runs the risk of becoming obsolete. However, for the short term: be optimistic and simply do it.

Posted in: SEO