Knight News Challenge

The Knight News Challenge accelerates media innovation by funding breakthrough ideas in news and information. Winners receive a share of $5 million in funding and support from Knight’s network of influential peers and advisors to help advance their ideas.

Throughout 2012, innovators from all industries and countries are invited to participate in three challenge rounds, each with focused topics on emerging trends.

Challenge 1 - on NETWORKS - is closed, and the winners will be announced June 18.

Challenge 2 - on DATA - will be open May 31 - June 21. We’re looking for new ways of collecting, understanding, visualizing and helping the public use the large amounts of information generated each day. Winners will be announced in late September.

Details on Challenge 3 will be available later this year.

Anyone, anywhere can apply for the challenge - whether for-profit start-ups or non-profit ventures. For more information on a variety of topics - from guidelines for for-profits to intellectual property licensing, open source software and more - visit our FAQ.

Opening IRS Nonprofit Tax Returns

1. What do you propose to do? [20 words]

Put 10 years of IRS Form 990, 990-PF, and 990-T online in bulk and extract 75 million fields of data.

2. How will your project make data more useful? [50 words]

Today, nonprofit returns are not available in bulk on the net. And, if you buy the 2,000 DVDs, you get a bunch of unusable TIFF files. We’ll put the core returns into PDF documents and extract useful, computable data for a significant subset of the corpus.

3. How is your project different from what already exists? [30 words]

We have 5 years of the corpus already online, but nobody has extracted fields to make the data computable. Guidestar has a service, but it is in a walled garden.

4. Why will it work? [100 words]

We’ve shown we can put the core data online and have released that code. We’re experts at hosting large databases and our marketing efforts will be aimed at hard-core developers all over the country who will use this data. Our sub for the data extraction has technology that is in full production and has convinced hard-nosed investors and Public.Resource.Org that it is for real, a going concern that can do its piece on schedule.

5. Who is working on it? [100 words]

Prime on this bid is Public.Resource.Org. We’ll do marketing, the core data build, and host the data. Our sub is Captricity.Com, an award-winning startup which will do the data extraction. Captricity’s technology is based on CEO Dr. Kuang Chen’s Ph.D. in Computer Science from Berkeley.

6. What part of the project have you already built? [100 words]

We have processed 5 years of Form 990s and released our code as open source at http://bulk.resource.org/irs.gov/eo2/. We have ordered the rest of the DVDs from the IRS. Captricity’s form processing and image recognition is in full production, though they are going to do some R&D to handle some of the custom aspects of this job.

7. How would you use News Challenge funds? [50 words]

$100,000 or $200,000 to Public.Resource.Org for marketing, developer relations, and core data; $400,000 to Captricity to extract 75 million fields in 12 months.

8. How would you sustain the project after the funding expires? [50 words]

We will seek additional funds from other foundations to finish the corpus and then browbeat the government to take the data and service over.

Requested amount: We’d like $600,000 but can do this for $500,000.

Expected number of months to complete project:
Total Project Cost: $600,000 or $500,000
Name: Carl Malamud
Twitter: @carlmalamud
Email address [optional]: carl@media.org
Organization: Public.Resource.Org
City: Sebastopol, CA
Country: US
How did you learn about the contest? The Internet
