New Scientist on Forum-scanning Software

Ryan · Jun 23, 2006

I'm sure the PTB would find good use for such technology. They probably already have... ;)

Untangling the chat-room debate

New Scientist said:
Software that follows an online discussion and picks out the most relevant post and the most influential participant could provide an automated synopsis of chat-room debates and email chatter.

The software was developed by Eduard Hovy and colleagues at the Information Sciences Institute, University of Southern California, US. They say it can single out the key post or email from thousands of messages.

To start, they manually categorised the messages according to their purpose - identifying, for instance, requests for information, answers to such requests and social chit-chat.

Then they used a lexical database to look for similarities in the vocabulary of each message and find relationships between them. Another lexical analysis technique was used to measure the degree to which a message was useful to posters, based on the language used in replies.

Page ranking

To integrate these different analyses, the team modified an algorithm called Hypertext Induced Topic Selection (HITS), which is normally used to rank web pages according to the links between them. But, instead of using it to search for the web page most relevant to a particular query, they used the algorithm to find the most influential post in a conversational thread.
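
For anyone curious what HITS looks like when pointed at a thread instead of web pages, here is a minimal sketch. It is not the ISI team's modified algorithm, which the article does not spell out. It assumes each reply is an edge from the replying post to the post it answers, and the per-edge weight is a hypothetical stand-in for the "usefulness" signal the article says was drawn from the language of replies. The post IDs and the `hits` function name are made up for illustration.

```python
from math import sqrt


def hits(replies, iterations=50):
    """Plain HITS on a reply graph.

    replies: list of (replying_post, parent_post, weight) tuples.
    The weight is an assumed stand-in for how useful the reply
    suggests the parent post was; use 1.0 for unweighted HITS.
    Returns (hub_scores, authority_scores) as dicts keyed by post ID.
    """
    nodes = sorted({n for src, dst, _ in replies for n in (src, dst)})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: a post gains authority from the hub
        # scores of the posts that reply to it.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst, w in replies:
            new_auth[dst] += w * hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}

        # Hub update: a post is a good hub if it replies to
        # posts with high authority.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst, w in replies:
            new_hub[src] += w * auth[dst]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}

    return hub, auth


# Hypothetical thread: p2, p3 and p5 reply to p1; p4 replies to p2.
thread = [
    ("p2", "p1", 1.0),
    ("p3", "p1", 1.0),
    ("p4", "p2", 0.5),
    ("p5", "p1", 1.0),
]
hub, auth = hits(thread)
top_post = max(auth, key=auth.get)  # post with the highest authority score
```

In this toy thread the post that draws the most replies ends up with the highest authority score, which is the rough analogue of the "most influential post" in the article.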

The dataset used by the team was threaded discussions between students on the USC undergraduate computer science course. The software was found to be 70% accurate at picking out the most relevant post, when compared to human analysis.

"I think people will want to try this technique to untangle threads in conversation records of all kinds, including message boards," Hovy says.

Rich structure

Jon Kleinberg, a computer scientist from Cornell University, New York, who developed the HITS algorithm, believes the approach has potential.

"It is a very nice application of link analysis," he told New Scientist. "Exploiting the fact that human conversations have a rich structure beyond the raw text they contain." Kleinberg, however, notes that the software is not yet fully automated, as the messages have to be categorised.

Categorisation is "notoriously difficult", Hovy concedes. To do this, "one has to combine the number of responses, the types of responses and the authority levels of the responders all together. We've only begun to untangle the threads".

The research was presented at the Human Language Technology Conference 2006 in New York, US, in May 2006.

Mr. Premise · Jun 23, 2006

This is huge. Good find.

Now they can probably use the technology to automate the trolling process! No need to hire malcontents...

Lots of layoffs in store at Rendon.

Guest · Jun 23, 2006

The next step would be to automate the trolls themselves. Just have the program harass forums directly. It wouldn't be much of a change from what is up now, just a removal of the flesh variable. And bare programs can't be scratched. I wonder if the Rense construct is a prototype. After all, nobody has really seen the guy...

Mr. Premise · Jun 23, 2006

Yep, that's why they will be able to lay off all the posters.

Although they would have to write in the "react when scratched" response to make it realistic!

I think you are on to something about Rense. He probably started out as a real person, though.

