The quality of entries in the world's largest open-access online encyclopedia, Wikipedia, depends on how authors collaborate, a study by University of Arizona researchers has found.

The research, they say, is the first to explain why some articles on the site are of much better quality than others.

Wikipedia has an internal quality rating system for entries, with featured articles at the top, followed by A, B, and C-level entries. The team randomly sampled 400 articles at each quality level and applied a data provenance model they had developed in an earlier paper.

"We used data mining techniques and identified various patterns of collaboration based on the provenance or, more specifically, who does what to Wikipedia articles," says Sudha Ram, professor at the University of Arizona Eller College of Management. "These collaboration patterns either help increase quality or are detrimental to data quality."

The study identified seven specific roles that Wikipedia contributors play. Starters, for example, create sentences but seldom engage in other actions. Content justifiers create sentences and justify them with resources and links. Copy editors contribute primarily by modifying existing sentences. Some users, the all-round contributors, perform many different functions.

"We then clustered the articles based on these roles and examined the collaboration patterns within each cluster to see what kind of quality resulted," Ram said. "We found that all-round contributors dominated the best-quality entries. In the entries with the lowest quality, starters and casual contributors dominated."

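The clustering step Ram describes can be pictured with a small sketch. It is only an illustration: the article does not say which clustering algorithm the team used, so plain k-means is assumed here, and the article names and role shares are invented toy values, not data from the study. Each article is represented by the share of its contributors in each of the roles named above.

```python
from math import dist

# Roles named in the article (the study defines seven; only these are named).
ROLES = ["starter", "content_justifier", "copy_editor", "all_round", "casual"]

# Hypothetical toy data: share of contributors in each role, per article.
# These numbers are invented for illustration and are NOT from the study.
ARTICLES = {
    "article_a": [0.10, 0.20, 0.20, 0.45, 0.05],
    "article_b": [0.05, 0.25, 0.20, 0.45, 0.05],
    "article_c": [0.45, 0.05, 0.05, 0.05, 0.40],
    "article_d": [0.40, 0.10, 0.05, 0.05, 0.40],
}


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(points, centroids, iterations=10):
    """Plain k-means (an assumed technique, not necessarily the study's)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean(members)
    return labels, centroids


names = list(ARTICLES)
points = [ARTICLES[name] for name in names]
# Deterministic start: the first and last articles serve as initial centroids.
labels, centroids = kmeans(points, [points[0][:], points[-1][:]])

for k, centroid in enumerate(centroids):
    dominant = ROLES[max(range(len(ROLES)), key=lambda i: centroid[i])]
    members = [name for name, lab in zip(names, labels) if lab == k]
    print(f"cluster {k}: {members}, dominant role: {dominant}")
```

With these toy values, the two articles rich in all-round contributors fall into one cluster and the two dominated by starters and casual contributors fall into the other. A real analysis would then compare each cluster's dominant roles against Wikipedia's quality ratings.
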
To generate the best-quality entries, people in many different roles must collaborate. The authors suggest that the results of this study should spark the design of software tools that can help improve quality.

"A software tool could prompt contributors to justify their insertions by adding links," she said, "and down the line, other software tools could encourage specific role setting and collaboration patterns to improve overall quality."