Channel: Tech in Asia

CocCoc, Another Big Vietnamese Search Engine is Here

On January 30th, CocCoc, a new Vietnamese search engine, arrived on the scene. It’s the second big homegrown Vietnamese search engine to appear in the past three months. And by big, I mean:

Well, including search raters and other staff, we have about 400 people. Five in marketing, four in accounting. You can safely assume that the rest is a technical team.

That’s Victor Lavrenko, CEO at CocCoc. The project, which started two years ago, was in closed development until very recently. Since inception, it has burned through over US$15 million, with most of the 400-person staff working full time. The new search engine looks to take on the incumbent, Google.com.vn, and the competitor that came out last November.

Russian roubles

Victor is from Russia. So are another 40 engineers on the team. These engineers have experience battling Google back home, where the native search engine Nigma got up to 20 to 25 percent of search queries. So they felt confident enough to fully enter the Vietnamese market. The company is mainly funded by Russian venture capitalists like Digital Sky Technologies, which also invested in Facebook early on.

CocCoc is actually a spin-off from projects back home, Victor explains to Tech in Asia:

It comes from an experimental search engine in Russia, nigma.ru. It gave us good experience and a good team. Second, we’re not just a startup with an idea or a weak prototype. We already have the engine and many experts, including current and former Google employees, and are quite enthusiastic about our search quality.

For example, the top Vietnamese guy at Google, Christopher Nguyen, thinks that in 92 percent of the navigational queries – when you search for a particular website – we’re the same or better than Google. Also, we have lots of very experienced individuals. About 10 people are from the leading Russian search engine Yandex that beat Google 3:1 by market share in Russia. The head of our web search team is the guy who created the first Russian search engine. It was later killed by an ISP that bought it out, so he did it for a second time with Mail.ru, and now it has about 10 percent market share – it’s not much, but quite comparable to Google’s 20 percent. And he had only $700,000, so the project was underfinanced, but he still got quite good results. I myself was a co-founder of Mail.ru, as a CTO, which is now a $7 billion company traded on the London Stock Exchange.

Russian roots can also be seen in the other newcomer mentioned above, Wada, which launched in November last year and is built on search technology from Ashmanov and Partners.

Search me for a reason why Russians are getting involved in Vietnamese search engines.

The Mechanics of CocCoc

CocCoc means “Knock Knock” in Vietnamese and signifies the team’s sensitivity to the Vietnamese language. Victor stresses that his new search engine will “better understand Vietnamese linguistics, and rarely produce irrelevant results. At least that’s our goal.”

For the nerdy folks out there, I had to ask Victor what specifically makes CocCoc better than Google’s search mechanics. According to Victor, since Google’s crawlers are outside Vietnam, its coverage of Vietnamese links is weak. CocCoc, he claims, has indexed two billion pages in-country so far, so its index is more up to date. But the Vietnamese language is the trickiest area, and it has proven to be a hard nut for Google to crack. Victor explains the linguistic minefield:

Well, it’s easy to explain the specifics using our name as an example. You may notice that it has a space within the word [Ed: Though not when we type it!] because Vietnamese words are written by syllables. They used Chinese characters before, and the principle is one Chinese character for one syllable. So usually words consist of two or more syllables. Even if a word is a one-syllable term, there is a so-called “pairness tendency” in the Vietnamese language: they will add a stop word or a syllable with the same meaning just to avoid saying the single syllable.

Another specific is diacritics. There are two dimensions of diacritics in Vietnamese. The first dimension is pronunciation type, e.g. “o” can be just “o” or “ô” or “ơ”. The second dimension is tones. It can be “o” or “ó” or “ỏ” or “ò” or “õ” or “ọ”. So altogether we have 18 combinations for all the Vietnamese vowels.
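
For the letter “o”, the two dimensions Victor describes multiply out to 3 × 6 = 18 written forms. Here is a minimal Python sketch of that arithmetic, assuming the standard Unicode combining marks for the circumflex, the horn, and the five tone marks. The names VOWEL_QUALITIES, TONE_MARKS, and o_variants are illustrative only and have nothing to do with CocCoc’s actual code.

```python
import unicodedata

# First dimension (assumed encoding): plain o, o + circumflex, o + horn.
VOWEL_QUALITIES = ["o", "o\u0302", "o\u031b"]

# Second dimension: no mark, acute, hook above, grave, tilde, dot below.
TONE_MARKS = ["", "\u0301", "\u0309", "\u0300", "\u0303", "\u0323"]


def o_variants():
    """Return every quality/tone combination of "o" as a composed character."""
    return [
        unicodedata.normalize("NFC", quality + tone)
        for quality in VOWEL_QUALITIES
        for tone in TONE_MARKS
    ]


if __name__ == "__main__":
    variants = o_variants()
    print(len(variants), " ".join(variants))
```

Each of the 18 results is a distinct character as far as a computer is concerned, which is why a search engine has to decide explicitly how to treat them.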

Apparently, CocCoc is ready to tackle these linguistic issues much better than Google. Users don’t have to type out the diacritics; the search engine accounts for them. That is also at the heart of Wada’s mission, and Wada too deals with diacritics better than Google does. I wouldn’t be surprised if CocCoc could manage the same, with such a high proportion of Vietnamese engineers on the task.
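
Neither company has published how it handles this, but one common way to let users skip the diacritics is to “fold” both the query and the indexed text down to bare letters before comparing them. The Python sketch below is an assumption about the general technique, not CocCoc’s or Wada’s method, and the function names fold_diacritics and matches are made up for illustration.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Lowercase text and drop Vietnamese diacritics (a simplifying assumption)."""
    # The letter d-with-stroke has no Unicode decomposition, so map it by hand.
    text = text.replace("\u0111", "d").replace("\u0110", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" covers the combining marks; keep everything else.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return stripped.lower()


def matches(query: str, document: str) -> bool:
    """Naive substring match on the folded forms of query and document."""
    return fold_diacritics(query) in fold_diacritics(document)


if __name__ == "__main__":
    print(fold_diacritics("Cốc Cốc"))
    print(matches("pho", "Phở bò Hà Nội"))
```

Folding on its own throws away the distinctions between the 18 forms, so a real engine would presumably still need to rank the accented candidates afterwards. That ranking step is where the linguistic work Victor describes would come in.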

Either way, even with its linguistic advantages, CocCoc is going to have to work very hard to defeat Google, which is already the most used website in Vietnam, according to Alexa. One thing it is working on is the “Pho Xa 360” feature, which will be like Google Street View for Vietnam, a service Google has not yet launched in the country.

The post CocCoc, Another Big Vietnamese Search Engine is Here appeared first on Tech in Asia.

