Robo-journalism: software as sports writer

As some of you may know, I occasionally freelance for the Phnom Penh Post sports section. So this story in today’s New York Times resonates for more than one reason.

ONLY human writers can distill a heap of sports statistics into a compelling story. Or so we human writers like to think.

StatSheet, a Durham, N.C., company that serves up sports statistics in monster-size portions, thinks otherwise. The company, with nine employees, is working to endow software with the ability to turn game statistics into articles about college basketball games.

Now, no one is yet suggesting that such software-generated stories will begin appearing in your local newspaper anytime soon. The market for this stuff is believed to be smaller universities that want coverage of their sports programs but cannot afford it. So this robo-copy will likely — hopefully — never make it further than the school Web site.

But.

Such efficiency seems part of a larger, worldwide business trend that demands more of everything for a lot less money. In the newspaper model, that often means more wire stories and less editing, among other peculiarities. And automating sports stories — or any stories, for that matter — would certainly dovetail with the greater cost-cutting ideals that currently grip the industry.

Sure, even the best algorithm will never be able to cover breaking news, or write an editorial piece. At best, a computer will just manage to lash together a few statistics into a game brief. Or perhaps stack some economic numbers together for a business wrap. But to a publisher trying to stop the balance-sheet bleeding, that will one day look like column inches on the cheap. And the temptation will likely be far too much to withstand.
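To show how little it takes to "lash together a few statistics into a game brief," here is a minimal sketch in Python. It is not StatSheet's method, which the Times story does not describe. The function name, the fields, the team names and the score thresholds are all assumptions made up for illustration.

```python
def game_brief(home, away, home_score, away_score, top_scorer, top_points):
    """Turn a tiny, hypothetical box score into a two-sentence game brief."""
    if home_score == away_score:
        raise ValueError("a finished basketball game cannot end in a tie")

    # Work out who won and by how much.
    if home_score > away_score:
        winner, loser = home, away
        win_pts, lose_pts = home_score, away_score
    else:
        winner, loser = away, home
        win_pts, lose_pts = away_score, home_score
    margin = win_pts - lose_pts

    # Pick a verb from the margin; the thresholds are arbitrary assumptions.
    if margin <= 3:
        verb = "edged"
    elif margin >= 20:
        verb = "routed"
    else:
        verb = "beat"

    return (
        f"{winner} {verb} {loser} {win_pts}-{lose_pts}. "
        f"{top_scorer} led all scorers with {top_points} points."
    )


# Made-up teams and numbers, for illustration only.
print(game_brief("State", "Tech", 78, 76, "J. Smith", 24))
```

Everything a reader would call writing lives in three hard-coded verbs. That is roughly why a template like this can manage a game brief but not much more.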

Tablets: a brave, flat new world

On the heels of the iPad’s overwhelming success, everyone is now releasing a tablet. Samsung, Dell, Archos and Toshiba have all brought products to market, and others will surely follow. Wired Magazine takes a look, but it’s the comments that are most informative (and entertaining).

Pirates of anonymity

The perils of assuming you are anonymous.

ACS: Law, a law firm based in Great Britain that tracks down alleged illegal file sharers for the porn industry, saw its database compromised over the weekend by members of the Internet forum 4chan. In addition to private e-mails and financial data belonging to the law firm, the names of people whom ACS: Law has accused of downloading unauthorized copies of porn movies were also revealed.

That sounds bad enough. But it gets worse.

The blog Torrentfreak reported that among the information posted to the Web were e-mails from people pleading for mercy and “married men who have been confronted with allegations of sharing gay porn.”

Unfortunate, no doubt. Here in Cambodia, such high-tech attempts at tracking down online pirates seem remote. Untoward political speech and affronts to culture remain the Kingdom’s most offensive topics. A few crude attempts appear to have been made at limiting information in this vein, though like many law enforcement efforts, they proved short-lived and of questionable success. Real-world piracy, meaning the millions of bootleg $2 music and software disks available in every local market, is still a much bigger problem and costs the country far more money.

Google’s new Facebook killer

In a direct assault on Facebook, Google has entered the social-networking wars with Google Buzz, a Gmail-integrated social-networking application. According to Google’s Todd Jackson, Buzz’s product manager, Buzz’s main features include:

  1. Auto-following
  2. Rich, fast sharing experience
  3. Public and private sharing
  4. Inbox integration
  5. Just the good stuff

According to a press release from Google:

The most noticeable advantage to Google Buzz is the way that e-mail comments and media, such as photos and videos, can be shared. Google Buzz automatically ‘follows’ the people who you communicate with most. Rather than broadcasting a passive “status message” like Facebook or “tweet” like Twitter, Google Buzz engages your friends by making the content that you find interesting available to them.

Most of the buzz about Buzz centers on its real-time commenting features and its mobile integration, including voice recognition, which allows users to comment by voice alone. No keyboard required! For developers, Google provides a Buzz API.

Not everyone, however, is enamored. And privacy issues have already been raised.

The Official Google blog has all the details.

How safe is Facebook?

Users in the United States were given access to the Facebook accounts of other people, reports the Associated Press.

“A Georgia mother and her two daughters logged onto Facebook from mobile phones last weekend and wound up in a startling place: strangers’ accounts with full access to troves of private information,” the story says.

The AP does not explain how the mix-up happened, but the problem is not with Facebook, apparently. The glitch, “a routing problem,” occurred between the users’ phones and their Internet service provider, AT&T.

Security experts interviewed for the story said they had never heard of a case like this, where users were given access to the wrong account. It’s unknown whether such a mix-up is rare, or just rarely reported. Experts agreed that the same flaw could affect other applications, such as email or blogging services.

READ IT: Network Flaw Causes Scary Web Error

MORE: Ars Technica provides a not-too-technical explanation of what likely happened, including this pithy synopsis:

“So it looks like AT&T did something wrong—even though I wouldn’t call it a “routing” problem—and the company is in the process of fixing things. But Facebook also shares some blame for this situation. Apparently Facebook, like many other sites, doesn’t think the information tied to a user’s account is important enough to protect with something stronger than a clear text cookie.”

Government adopts Khmer Unicode

Cambodia’s main international airport first went digital in 2003. The new system greatly increased the time it took to get in or out of the country, as airport officials unfamiliar with computers labored to understand the vagaries of Windows.

“We apologize for any delays that are caused by the use of our new computer system,” read little signs posted at each computer terminal.

They were still there two years later.

Even today, Cambodia remains in the very early stages of computer adoption. Most government ministries still keep hand-written records, and the exchange of data between agencies relies on an ad hoc system born of secondhand photocopiers and oil drums of ink. Decrepit phone lines and low computer-literacy rates add to the challenge.

The greatest hurdle of all has been the Cambodian language itself. Beautiful as the script is, there has been no standard way to display it on a computer. Until now.

In late December, the government passed a sub-decree requiring the use of Khmer Unicode for all government correspondence.

In years past, the choice of typeface was left to the user, and as many as 30 different versions of “Khmer” competed for supremacy. There existed no uniform way to create the same characters across different fonts, which meant typists had to know them all, or stick to the few they did.

If documents arrived with an unknown character, too bad. Though font converters existed, few of them worked well, and translating from one font to another could take days or more.

Even more problematic, the lack of a font standard strangled the development of intra-government computer networks and centralized data storage. How could the government build a nationwide database of criminals, for example, when it could not even agree upon the font to use for data entry?

The move to Khmer Unicode fixes all that, and it provides the government a proper foundation on which to build a modern information system.
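For the technically curious, here is a small Python sketch of what a shared standard buys. Under Unicode, each Khmer character has one fixed code point no matter which typeface draws it, so any program can check or store the text the same way. The helper names `describe` and `is_khmer_unicode` are my own, for illustration only. The range check covers just the main Khmer block, U+1780 to U+17FF.

```python
import unicodedata

# The word "Khmer" in Khmer script, spelled out as Unicode code points.
KHMER_WORD = "\u1781\u17d2\u1798\u17c2\u179a"


def describe(text):
    """List each character's code point and its official Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


def is_khmer_unicode(text):
    """True if every non-space character sits in the main Khmer block."""
    return bool(text) and all(
        ch.isspace() or 0x1780 <= ord(ch) <= 0x17FF for ch in text
    )


for code_point, name in describe(KHMER_WORD):
    print(code_point, name)

print(is_khmer_unicode(KHMER_WORD))   # Khmer script in Unicode
print(is_khmer_unicode("Khmer"))      # Latin letters, so not in the Khmer block
```

A document typed in one of the older fonts would fail a check like this, because those fonts stored Khmer glyphs under codes meant for other characters. That is why a nationwide database could not be built on them.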

Apologies for any delay.