So I went to the Data.gov website to take a look. What I was afraid of was that we were duplicating work already accomplished. I figured that the easiest course would have been to take data from an agency and post a copy of it on the Data.gov website. To my pleasant surprise, that is not what has happened. Data.gov is merely linking to data that is already exposed on the agencies' own websites. This is a very good thing, because what made me most nervous about the whole effort was data quality.
If I work for USDA and I post data on our site, and then post that same data on the Data.gov website, I could run into a problem. Inevitably someone will identify a data issue that I will work to correct, and then I have two copies to fix. The way Data.gov appears to be approaching this avoids the problem, because there is one and only one copy of the publicly available data. If USDA makes a correction to the data, they don't have to send the corrected data to Data.gov for re-publication. Good.
Many years ago, when I worked at HUD, I embarked on an initiative to assemble all of the data we had. My State and Local CPD Information site is still there, but probably not for long, as HUD is refreshing its website. This was my big aggregation point for all data related to the Office of Community Planning and Development, including:
- CDBG Accomplishments
- CDBG Disbursements
- CDBG Performance
- HOME Income Limits
- HOME Rent Limits
- HOME Performance Reports
- HOME Quarterly Performance Reports
- Homeless Grant Awards
- Formula Allocations
- Census Data important to HUD
I am not claiming to have created any of this data. Rather, my idea was to aggregate it in a manner that helped it to be more consumable by real people with real questions. As such, I created pages like this:
http://www.hud.gov/offices/cpd/about/local/pa/index.cfm
so that if you want information about the state of Pennsylvania, it is all right here at your fingertips.
That is what Data.gov is doing. They want to be a place that can help people to aggregate this data. If there is an opportunity to grow, it is in the way the data is sliced. Put yourself in the shoes of a real person, perhaps someone in city or state government. This person has questions she wants to answer. If she is in Maine, she will not be interested in data from California. Someone has to think about the questions we are striving to answer. If we are just trying to feed the national-level researchers, then this set-up is great. But if we want to create something that can help to answer state or local questions, not so much.
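To make the slicing idea concrete, here is a minimal sketch in Python of what I mean: take one national file and hand back only the rows for the state someone is asking about. Everything in it is an assumption for illustration. The column names, the sample rows, and the `slice_by_state` helper are hypothetical placeholders, not real HUD or Data.gov data or any actual Data.gov interface.

```python
import csv
import io

# Hypothetical national file. The column names and values are made-up
# placeholders for illustration, not real HUD or Data.gov data.
SAMPLE_CSV = """state,program,amount
ME,CDBG,100
CA,CDBG,200
ME,HOME,50
PA,HOME,75
"""


def slice_by_state(csv_text, state):
    """Return only the rows whose 'state' column matches the given code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = state.upper()
    return [row for row in reader if row["state"] == wanted]


# Someone in Maine sees only the Maine rows, not California's.
maine_rows = slice_by_state(SAMPLE_CSV, "me")
for row in maine_rows:
    print(row["program"], row["amount"])
```

The point is not the code itself but the order of operations: start from the question ("what is happening in my state?") and filter the national data down to it, rather than handing over the whole file and leaving the filtering to the reader.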