Tan Kin Lian's Blog: PDF centric website [2]

Saturday, August 14, 2010

PDF centric website [2]

I made a deliberate decision to pursue a document-centric website. Some people, who are used to the traditional HTML design, disagree with my approach and argue that search engines cannot find the keywords in the documents. (Some of these critics make a living by designing websites using the traditional HTML approach.)

I am aware of this perceived disadvantage of the PDF design, but it is inconsequential. If a reader searches for the word "insurance", he will get millions of links. Do you expect your HTML webpage with the word "insurance" to appear on the first page of results?

The webpages that appear at the top are from people who pay Google a few dollars per click. The buying rate could be less than 1% of clicks, so the cost of this type of advertising is very high.
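To put rough numbers on this, here is a minimal sketch of the arithmetic. The figures are assumptions chosen only to match "a few dollars per click" and "less than 1%"; they are not actual advertising rates, and the function name `cost_per_sale` is purely illustrative.

```python
def cost_per_sale(cost_per_click: float, buying_rate: float) -> float:
    """Average advertising spend needed to win one sale.

    cost_per_click: amount paid for each click (hypothetical figure).
    buying_rate: fraction of clicks that end in a purchase, between 0 and 1.
    """
    if not 0 < buying_rate <= 1:
        raise ValueError("buying_rate must be greater than 0 and at most 1")
    # On average, 1 / buying_rate clicks are needed for one sale.
    return cost_per_click / buying_rate


# Assumed example: $2 per click, 1 click in 100 leads to a purchase.
print(cost_per_sale(2.0, 0.01))  # about 200 dollars of advertising per sale
```

With a buying rate below 1%, the cost of each sale would be higher still.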

If you search for "Tan Kin Lian", you will get to my website anyway. I do not expect people to come to my website through the keywords of "insurance", "investment", "property" or other generic words.

There are several advantages to a PDF centric website. It is easy for the reader to download a PDF containing the FAQs. The PDF can be read offline, printed or mailed to other people. These advantages far outweigh the disadvantage of "missing the search engine".

I invite innovative enterprises to build a PDF centric website as a second website for their organisation. You do not have to dump the current website. Have two websites and see which attracts more visitors. You can read the PDF centric website here.

Tan Kin Lian

4 comments:

rex said...: Rex comments as follows,

Whether the information is coded into pdf (hence not searchable) or available directly on html pages and hence word-searchable is not really the point.

In my opinion, the main point is whether the information EXISTS or not in the company's webpage. So long as it exists, it is always possible to find it by doing a search either of the word, or using the title of the pdf document.

Therefore, in my opinion, the issue at hand is really WHETHER specific information is, or is not available to customers at all in the corporate websites.

E.g. suppose you want to find out whether, upon death, your CPF money will be paid to your beneficiaries as cash or as CPF money. YOU CAN'T FIND THE ANSWER ON THE CPF website, even if you key in the search question. Therefore it makes no difference whether you use their website or TKL's website. The information just isn't there.

It's got nothing to do with the pdf or not pdf method. The method of presentation is not really so critical. The search either at title level (pdf documents) or word level (raw html) will yield results if the information is there. If the information is not there, there is no value added by making changes to the style of the website, or by appending to the existing website.

rex; August 14, 2010 2:51 PM
Spur said...: Actually all mainstream document management systems do allow indexing and searching of PDF text. Whether you implement it via a duplicate hidden text file of the full document, or of the first page, or thru algorithmically selected keywords stored in metadata etc, it is already in practice.

I had been involved in some of the above 10 years ago already. Now the technology should be even more powerful, intelligent and faster.

Look at Google search engine. They are also able to search for words & phrases within PDF files.

There are widgets (both commercial & freeware) that people can incorporate into their websites for extra searching functionality for PDF, even got all sorts of extreme searching like for pattern matching, photographic data, fingerprint matching, voice & audio searching and matching etc etc. As long as you can turn it into bits & bytes, you can search it. My colleague who was involved in the first ERP gantries was also implementing face & license plate searching & recognition.

Just need to be aware of costs (including effort & time) and benefit. No point expending so much resources to come out with a super-duper search interface and capabilities, when user-friendliness and ease of use already there.; August 14, 2010 6:56 PM
dsowerg said...: I do not like PDF files because my browser will always download PDF files to my local drive. I can't find a way to disable this function. So I end up with a huge bunch of downloaded files and have to periodically delete them. It's quite annoying.; August 14, 2010 7:38 PM
AB said...: "I do not like PDF files because my browser will always download PDF files to my local drive".

I wonder which browser is that? Then you should use one of the more common browsers like Firefox, Chrome or IE, as they all are able to view pdf without having to download first.; August 15, 2010 8:42 PM