Weibo and 星期 vs 礼拜

Just a note to myself that WEIBO IS AMAZING. After throwing up my hands at Twitter’s worthless search functionality* (Google’s discussion search is useful, but no holy grail), it is a pleasure to use something this intuitive. Even though I have to re-translate the whole thing into my second language, I daresay it still makes more sense than Twitter does. I’m playing around right now with all sorts of things, including working with my friend on some simple code to search for banned keywords. For instance, searching for “艾未未” (Ai Wei Wei) yields this hilariously transparent message:

根据相关法律法规和政策,搜索结果未予显示。热门微博推荐 (Rough translation: According to laws, legislation, and policies, the search results are not shown. We recommend blogging about popular things [emphasis mine; literal translation of 热门].)

Pussyfooting around this Weibo is not.
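For the banned-keyword idea, here is a minimal sketch of the checking step only. It assumes a blocked search page always contains the opening of the notice quoted above. How the page actually gets fetched is left out: `fetch_page` is a hypothetical stand-in (keyword in, page text out), and `fake_fetch` exists only so the example runs without touching Weibo.

```python
# Opening of the notice quoted above; assumed to appear on every blocked search page.
BLOCK_NOTICE = "根据相关法律法规和政策"


def is_blocked(page_text: str) -> bool:
    """Return True if the page text carries the 'results not shown' notice."""
    return BLOCK_NOTICE in page_text


def check_keywords(keywords, fetch_page):
    """Map each keyword to True (blocked) or False (results shown).

    fetch_page is a hypothetical callable: keyword -> text of the results page.
    """
    return {kw: is_blocked(fetch_page(kw)) for kw in keywords}


def fake_fetch(keyword: str) -> str:
    """Stand-in for a real fetcher, for demonstration only."""
    if keyword == "艾未未":
        return "根据相关法律法规和政策,搜索结果未予显示。"
    return "(ordinary search results)"


if __name__ == "__main__":
    print(check_keywords(["艾未未", "星期一"], fake_fetch))
```

Matching only on the opening clause of the notice sidesteps differences in punctuation width between pages.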

So, interesting results so far?

1) As someone who grew up speaking Taishanese, laibai (pinyin: libai; 礼拜) was my word for week (e.g., 礼拜一 for Monday), while sengkay (xingqi; 星期) was reserved for newscasters and certain older speakers. My language partner, who is Taiwanese, almost exclusively uses 礼拜 as well. But as any student of Mandarin today knows, 星期 is the standard word, and 礼拜 seems to have developed a religious connotation.** For the most part, though, they are semantically equivalent, so variations in usage appear to be simply a) regional, b) generational, or c) contextual (informal versus more formal). It’s sort of (emphasis on sort of) like the great American debate between soda and pop, and with Weibo, you don’t have to actually design and tabulate a survey of who uses what where; the data is all already online, coded by gender, age, and location.

It’ll take some time to scrape some of this data (no way in hell I’m going to sit here and do this by hand; but the sad thing is that it probably will take me just as long to figure out how to code the script to do what I want… sigh), but preliminary results:

Search results per term:

礼拜天: 251748      星期天: 2461924
礼拜日: 48962       星期日: 1115494
礼拜一: 272460      星期一: 3436480
礼拜二: 78241       星期二: 1336238
礼拜三: 88890       星期三: 1287634
礼拜四: 91038       星期四: 1272936
礼拜五: 245327      星期五: 3157894
礼拜六: 253894      星期六: 3177002
礼拜七: 2664***     星期七: 50031***
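To put the two columns on the same scale, here is a quick sketch that works out what share of each day’s mentions use 礼拜 rather than 星期. The counts are copied from the list above; the name `libai_share` is mine, and I’ve left out the made-up seventh day.

```python
# (礼拜 count, 星期 count) for each day suffix, copied from the results above.
counts = {
    "天": (251748, 2461924),
    "日": (48962, 1115494),
    "一": (272460, 3436480),
    "二": (78241, 1336238),
    "三": (88890, 1287634),
    "四": (91038, 1272936),
    "五": (245327, 3157894),
    "六": (253894, 3177002),
}


def libai_share(libai: int, xingqi: int) -> float:
    """Fraction of a day's combined mentions that use 礼拜."""
    return libai / (libai + xingqi)


shares = {day: libai_share(*pair) for day, pair in counts.items()}

if __name__ == "__main__":
    for day, share in shares.items():
        print(f"{day}: {share:.1%}")
```

This only compares raw search counts, so it says nothing yet about who the 礼拜 users are.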

All right! And because the deputy likes dots, here it is in visual form:

So it’s official: on Weibo, Monday is the most popular day, closely followed by Saturday and Friday. Wednesday and Thursday are in a dead heat for least cited. Curious what a similar chart would look like on Twitter… oh wait, I can’t generate one. Dur. (Though I guess you could use Google to get a rough estimate, but those aren’t hard numbers like these on Weibo.)
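For anyone who wants to check the ranking without squinting at a chart, here is a rough sketch that adds the 礼拜 and 星期 counts for each day and sorts the totals. The numbers are the ones reported above. One assumption to flag: the two spellings of Sunday (天 and 日) stay as separate rows, so each ranks lower than a combined Sunday would.

```python
# (礼拜 count, 星期 count) per day suffix, as reported above; the seventh day is omitted.
counts = {
    "天": (251748, 2461924),
    "日": (48962, 1115494),
    "一": (272460, 3436480),
    "二": (78241, 1336238),
    "三": (88890, 1287634),
    "四": (91038, 1272936),
    "五": (245327, 3157894),
    "六": (253894, 3177002),
}

# Combined mentions per day, regardless of which word for "week" is used.
totals = {day: libai + xingqi for day, (libai, xingqi) in counts.items()}

# Day suffixes ordered from most to least mentioned.
ranking = sorted(totals, key=totals.get, reverse=True)

if __name__ == "__main__":
    for day in ranking:
        print(day, totals[day])
```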

A future project would be to do a similar analysis of other paired words like these, and to dig further into the data to figure out where these libai users come from and what similarities they share.

*What is it with Web 2.0 folks and broken search? That was aimed at you, Tumblr; get your act together.


***Not a real day, but I was curious to see whether it’s used. I’ll have to go back and analyze what people actually mean when they say Seven-day.