Sunday, December 11, 2011

Text Summarization Tools and Translation

Today one of my clients asked me (again) to summarize the news reports I translate for them, so that readers don't have to wade through details they aren't interested in. I had dodged this request before, because it was hard for me to find the extra time to summarize a text first and then translate it. A translator normally doesn't need to read a passage twice: you read sentence by sentence and translate as you go. Only rarely, when the subject is unfamiliar, does a translator have to read the whole text first to get an overall idea of it. So I was being a bit selfish, because summarizing meant spending extra time on the same job.

Having said that, when the request came in again today, I tried to find a workaround that would keep both my client and me happy. Then the idea of automatic text summarization tools hit me, and I got very excited. I went to Google and searched for "creating summary of text", but it kept giving me tips on "how to write executive summary" and so on. That didn't go well, I said to myself, so I changed the keywords to "text summarization tool". That turned up some leads: an offline application and some very exciting online tools that worked for me. So today I used online automatic summarization tools for the first time. I summarized a news report (in English) and then translated it. This way my client is happy (I hope) and so am I. I'm sharing these tools here in case someone else needs to summarize an English text before translating it.

Text Compactor was the best tool I found today. It is based on an open source summarization tool, and I think it is a very good implementation. It works well for longer texts as well as short ones. With a text of 600 words or so, it cut the text roughly in half when I chose 50%. But for a 400-word text, 50% did not shorten it to half or anywhere near; it returned the same text. Then I selected 30% (35 and 40 didn't work either) and it gave a very compact summary. So next time I think I should ask for one third of longer texts as well, so that I can fit more news reports into my daily professional Urdu translation limit for that client.
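For the curious, here is a rough idea of how a percentage setting can turn into a shorter text. This is only a minimal Python sketch of one common approach (scoring sentences by word frequency and keeping the top share of them). It is my own illustration, not the code behind any of the tools mentioned here; the function names, the tiny stopword list, and the `ratio` parameter are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real tools use much larger ones).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "for", "on", "that", "this", "was", "with"}


def split_sentences(text):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def content_words(text):
    """Lowercased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, ratio=0.5):
    """Keep roughly `ratio` of the sentences, chosen by average word frequency."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    # Number of sentences to keep: at least one, rounded up.
    keep = max(1, math.ceil(len(sentences) * ratio))
    # Rank sentence indices by score (ties keep their original order).
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:keep]
    # Emit the chosen sentences in their original order.
    return " ".join(sentences[i] for i in sorted(ranked))


if __name__ == "__main__":
    sample = "Cats chase mice. Cats sleep a lot. Dogs bark. The weather is nice."
    print(summarize(sample, ratio=0.5))
```

Note that this sketch counts sentences, not words, so a 50% setting does not guarantee a text half as long. Real tools may measure the percentage differently.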

This one was pretty good too, but it is a demo version, and it was not as good as the one above. So I left it.

This was the first result I got. It was actually written for Swedish but supports a number of languages, including English. The tool is pretty old and doesn't look maintained anymore. It offers some good options for summarization, but the results are not very good. After summarizing the longer text (the same one I used with Text Compactor), I had a feeling that automatic summaries might not work for me. It focused too much on data and statistics, two sentences were left without a subject, and the ending (which I think should have been part of the summary) was omitted, so I had to add it myself. But later Text Compactor solved the problem, and I hope it will keep doing so, because I cannot afford the extra time to summarize and then translate, or to decide what should and should not be included in the translation.

So folks, if you ever need a summary, do try these online tools. They work well.