title: What I’ve Learned As A Data Scientist In Washington
url: http://www.forbes.com/sites/kalevleetaru/2016/03/29/system-failure-what-ive-learned-as-a-data-scientist-in-washington/
“Extraordinarily overqualified for government service.” That was the director’s summary that concluded my interview for one of the most prestigious US Government technology fellowships and served as my first introduction to the world of data in government. It is also a quote I have heard again and again as I have navigated the halls of government technology, data and innovation offices, where creativity is stifled at every turn, where “creative thinker” and “innovative” are spoken in hushed tones for fear of being overheard, and where “innovation” is simply a buzzword to be tossed into press releases rather than a guiding principle driving how people think about the world. As I approach the three-year anniversary of when I first landed in Washington, DC, inspired by White House talk of technology-driven innovation and with bold dreams of infusing government bureaucracy with fast-paced, data-driven, innovation-powered change, I’m struck by the polar opposites of innovation-fueled Silicon Valley and the stifling bureaucracy of Washington. Here’s what I’ve learned in three years as a data scientist in Washington.
Arriving in Washington just under three years ago I expected to be greeted by a bureaucracy in transition, with highly innovative and creative thinkers embedded across government with a mandate to fundamentally transform how government interacts with its citizens. Indeed, the Baltic nation of Estonia offers a case study in the power of technology to reimagine government, reinventing itself as an entirely innovation-driven government in which creativity, innovation, data and technology are the guiding principles behind every interaction of citizen and government. Every task from voting to purchasing a fishing license can be accomplished online through a smartphone while on the go – no waiting in line at a government office or hours of filling out reams of paperwork. In fact, “today one need appear in person at a government office only to finalize marriage or divorce and to purchase property – every other transaction between a citizen and the state is completed entirely online and can be done via smartphone from anywhere in the nation or even abroad.”
With this “digital revolution has come increased transparency and compliance, decreased waste and corruption, and an economic powerhouse that has rapidly built the nation into a digital-first workforce and stands testament to the power of data and technology in making government work for its citizens.”
As tax season is upon us here in the United States, it is worth noting that filing your taxes in Estonia requires just three mouse clicks and can be completed in less than five minutes, as the video below explains.
Yet, the experience I have had here in Washington could not be more different. When I arrived in 2013 I brought with me nearly 20 years of experience in the data-driven and online worlds, having founded my first web startup the year after Mosaic debuted while still in middle school and having spent 15 years in the high performance computing (supercomputing) world. I actually started as a high school intern at the National Center for Supercomputing Applications (the birthplace of Mosaic), with one of my very first assignments being to build what was at the time one of the largest web-scale web mining platforms at an academic institution while a senior in high school. Over the years I’ve worked in fields ranging from artificial intelligence to virtual reality with a focus on thinking outside of the box and reimagining how we think about information and society. Arriving in Washington, I aimed to bring to the staid quarters of diplomacy and policymaking the Silicon Valley mentality of out-of-the-box, data-driven decision making. Instead, I’ve encountered a world openly hostile to data and innovation. Here are some of the many lessons I’ve learned from these experiences.
Aversion To Risk
As I wrote in a 2014 op-ed, the single biggest need for the US Government is to “learn how to innovate and innovating means learning how to fail.” In contrast, “Government favors ‘death by committee’ billion-dollar monoliths that spend more on planning than actual execution, cultivating a fear of failure. Silicon Valley biases towards action and ‘fails fast, early and often’ with small prototypes that rapidly prove (or disprove) an idea before much is invested: Failure becomes experience and its low cost cultivates innovation.” As the Washington Post so poignantly put it “A failed project in Washington is akin to a great tragedy — with managers being called to testify at congressional hearings and Government Accountability Office investigations being launched into why so much taxpayer money was wasted. But in the entrepreneurial world, say tech leaders, failure is regarded as a learning opportunity on the way to the next innovation.”
This fear of failure compels government to wrap itself in a comforting blanket of bureaucracy that stalls change and encourages managers not to embark upon projects that disrupt the status quo. A successful project will result in someone more senior taking credit, while a failed project could end a career. In today’s world, the young Ensign Chester Nimitz, who ran his ship aground on only his second command, would likely have been discharged from the Navy rather than given a chance to learn from the experience and ultimately become CinCPac, Fleet Admiral and one of the US Navy’s foremost military officers, helping to build the US submarine fleet. Today the slightest issue results in immediate dismissal, great theatrics and public shaming rather than a learning experience that builds better leaders confident in their decisions and willing to try new things.
Fitting its bureaucratic nature, government tends to construct one-size-fits-all approaches to risk. A government contractor designing and manufacturing novel weapon systems understandably might be required to carry a high level of insurance to guard against costly accidents. Yet, last year when I was contracted on behalf of a government agency to write a report summarizing data mining techniques for socio-cultural research and pilot the applicability of a pre-existing software package for such research, I was ordered to purchase more than $8 million in insurance because I would be spending a single day presenting my work in a conference room in a government building and a previous contractor had had a minor accident while in a government building at some point in the past. Writing a brief report and experimenting with a data mining software program, or setting foot in a government building, should not require $8 million of insurance. Yet, government is unable to distinguish between high-risk work like weapons design and low-risk work like report writing.
This one-size-fits-all approach to risk permeates to the deepest levels of paperwork required to interact with the government. When registering with the federal government as an individual data scientist and using the US Government’s form for sole proprietorships, I was still forced to fill out an entire set of questions regarding the operation of satellite launch facilities. I had to certify that I was not, among other things, “a foreign entity in which the government of a covered foreign country has an ownership interest that enables the government to affect satellite operations” and that I was not “offering commercial satellite services provided by a foreign entity that plans to or is expected to provide or use launch or other satellite services under the contract from a covered foreign country.”
While it would make sense to ask such questions of large defense contractors operating satellite launch facilities for the US Government, they seem slightly out of place on a sole proprietorship form for a data scientist. Common sense might suggest that there are likely very few private individuals in the United States registering as self-employed data scientists who own and operate their own private satellite launch facilities and are renting them out to the US Government.
The problem is that while all of these activities reduce the appearance of risk, they actually often increase true risk. As Alex Howard put it “the trouble is that all those [procurement] rules are selecting for huge companies that are very good at getting contracts not necessarily in delivering results.” In other cases, by mandating outdated and obsolete software, government is actually introducing severe cybersecurity dangers.
A Disregard For Data
Data is an unwelcome stranger in DC. Perhaps the greatest reason is its fundamental mismatch with the rigid top-down hierarchy of Washington. Proper data analysis requires suspending your own biases and beliefs and letting the data take you to its own conclusions. If those conclusions are contrary to your own expectations, you likely will run additional experiments and tests to confirm them, but if you disprove every conceivable alternative explanation, at a certain point, you have to listen to the data and use it as a starting point to understand why it is saying what it is. In Washington that is often tantamount to saying that a 21-year-old computer science graduate, fresh out of school and without a bit of policy experience, might have a better understanding of a situation than a 40-year veteran policymaker who graduated from a top foreign policy school, lived in the country in question for a decade and meets for lunch with its ambassador once a month. In such a rigid protocol-driven culture that values experience and connections, the ability of data to trump gut feeling can be a dangerous thing in the eyes of many policymakers.
One large government agency I spoke with said their previous analytics program had been scaled back and was presently in a holding pattern because its findings frequently contradicted the gut feelings of senior policymakers. At another agency I was told that a massive statistics collection program was being suspended because its conclusions suggested US policy in the region was failing, while another agency where I was helping to advise a project banned me from using US Government-collected survey data from the region because it suggested US policy was actually making the situation worse rather than better. In all of my conversations across government, when data and policy come into conflict, it is always data that loses. As one 25-year foreign policy veteran noted about the US Department of State, policy memos countering conventional wisdom or questioning senior policy decisions are buried from view for weeks or even months, with “bureaucratic chieftains us[ing] the process to stall, harass, and exhaust their opponents” until it is too late for the alternative views to be heard.
I once met the former director of one of the US intelligence agencies who told me “The problem with all this big data stuff is the guys bringing it to you are young kids in their 20’s and 30’s – I will never trust anything that comes from someone younger than 50 so data holds no value for me” and followed it with “If I have a question, I’ll meet with that country’s ambassador – I can learn more from looking into his eyes than any spreadsheet can ever tell me.” He is certainly not alone, with former President Bush similarly remarking after meeting President Putin in 2001 that “I looked the man in the eye. I found him to be very straight forward and trustworthy and we had a very good dialogue. I was able to get a sense of his soul.”
Gut feelings tend to trump data in the world of Washington politics in which facts can all too often come into conflict with political priorities. In fact, Congress has even banned the collection of certain politically sensitive data points. Comprehensive gun ownership data is more than 20 years old, while federally-funded research into gun violence is still rare, despite shifting Congressional views. Since 1995 the US Park Service no longer estimates the size of crowds on the National Mall after Louis Farrakhan threatened to sue the Park Service for what he claimed were underestimates of the number of attendees at an event he organized. Even the amount of US debt held by Saudi Arabia is withheld from public access, while online posts by one of the San Bernardino terrorists were missed by US immigration officials due to official government policy prohibiting the inclusion of online comments in immigration decisions for fear of “bad public relations.”
In some cases there are even allegations of outright fabrication and manipulation of data to better fit political narratives. In the case of the Defense Intelligence Agency, multiple investigations are examining whether senior supervisors rewrote intelligence assessments about the Islamic State to change their findings from stating that ISIS was being only modestly damaged by US airstrikes and that US policy was failing to counter the group’s rise to stating that they were being devastated and the US was winning its war on terrorism. Of course, the US would not be the first government to manipulate official data, with China recently launching an investigation of the head of its national statistics agency for “serious violations.”
Alleged fabrications are not the only source of data trouble in government – often wrong data comes down to simple error. Official government statistics are traditionally touted as authoritative data on everything from the economy to crime. Yet, the collection and compilation of such data is subject to all the same limitations and potential for error as any other data. This past winter Reagan Airport made national headlines when it significantly undercounted snowfall after it lost its official measuring device in the snow. Such data are often fed into complex computer algorithms and simulations through intricate workflows such that state-of-the-art computer models running on the world’s most powerful supercomputers are ultimately based on a guy walking outside in the snow with a stick and eyeballing how much snow has fallen on a special table.
Thinking Inside The Box
The bureaucracy of government is designed by its very nature to create glacial inertia that stifles change in order to ensure continuity across administrations. However, this inertia also suffocates technological evolution, ensuring government operates as an innovation backwater across all of its branches.
As one House member put it, Congress is a “19th Century institution often using 20th Century technology to respond to 21st Century problems” in which sending out an email announcement in 2012 required “a couple days to program the computers to be able to handle a mass email.” In an era in which “likes” and clicks are reported in realtime, Congressional mass emails could not even record how many people actually received or viewed them. One staffer noted “coming to the House of Representatives from Silicon Valley is like going in a time machine.” Until June of last year House members were largely prohibited from using standard open source software like WordPress to build their websites because of interpretations that open software, by virtue of being free, violated Congressional gift bans.
It’s not that Congress doesn’t use technology – in fact it spends extensively on technology, pouring nearly a quarter billion dollars in 2014 alone into the IT systems supporting its members and their staff. Instead, the majority of this spending is restricted to a small number of preapproved vendors, with often just a single vendor being authorized to provide services like high-volume mass emails. In the absence of robust competition, vendors are rarely responsive to requests for new features or capabilities, while government often pays vastly higher costs for software that frequently offers just a fraction of the capabilities of commercial off-the-shelf tools.
The byzantine contracting and approvals processes of the government have created an entire cottage industry of companies that provide software solely to government. Of the tools I’ve personally reviewed, most tend to be woefully out of date with industry standards and provide just a small fraction of the capabilities available from the commercial market while charging in some cases many orders of magnitude more. Interoperability is often nonexistent – I once saw a vendor claim it needed to replace all of the other software packages a government agency was using with its own equivalent tools “to ensure a seamless user experience.” While there is certainly merit in a suite of tools from a single vendor all sharing the same user interface, such justification is all too often used to cross-sell a vendor’s offerings, resulting in lock-in, higher costs and reduced capability.
The majority of the technology and innovation specialists I’ve met in my tours of government data offices, even those who originally came from outside government, tend to become locked into this narrow way of thinking. For example, when I met with the heads of data and innovation for a major data-rich federal agency, they lamented that government contracting and IT rules made it extraordinarily difficult to try and purchase their own cluster of servers to build a private cloud to offer their data for public download. After confirming with them that the data in question was entirely open data with no access restrictions, I suggested a short term fix – just upload copies of all of the datasets to GitHub. After all, that’s where the open source community congregates and many developers are already copying government datasets to GitHub in a myriad scattered forms, so having an official government repository for the data would bring the data to the developers and alleviate the need for government to build its own cloud to host it, at least in the short term.
Despite having some of government’s best and brightest in the room, there was stunned silence and, after a few moments, one of the senior people spoke up and said quietly that this was one of the most brilliant, simple and creative approaches he had ever heard. I pointed out that I hadn’t actually come up with the idea; it was already common practice – GitHub is already widely used by more than 600 government agencies around the world to publish their data and code, including the White House, so the processes and approvals are already in place for US agencies to make use of it. The problem is that while everyone in that room was very familiar with GitHub and likely used it extensively in their work, they had gotten so caught up in the government mentality and approach to IT that they saw their only option as building their own private cloud rather than leveraging the industry-standard public cloud. Once in government, even the brightest of innovators tend to forget that if you want people to use your data, you should bring it to them, rather than making them use a cumbersome, outdated and brittle government data delivery service just to download a CSV file.
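For readers outside the data world, it is worth spelling out just how small a lift the "just upload it to GitHub" fix actually is. The sketch below uses invented names throughout – the dataset file, repository and agency are hypothetical – and the final push is commented out because it requires a real GitHub remote and credentials:

```shell
# Publishing an open dataset is little more than committing a file
# to a Git repository. All names here are illustrative.
mkdir -p open-data
git init -q open-data

# Stand-in for the agency's existing open dataset.
printf 'state,year,value\nVA,2015,42\n' > open-data/crop_yields_2015.csv

git -C open-data add crop_yields_2015.csv
git -C open-data -c user.name="Data Office" -c user.email="data@example.gov" \
    commit -q -m "Publish crop_yields_2015 dataset"

# The only GitHub-specific steps, shown commented out:
# git -C open-data remote add origin https://github.com/example-agency/open-data.git
# git -C open-data push -u origin main
```

Everything before the commented lines is plain version control that any developer already knows; the public-cloud hosting the agency wanted to build from scratch reduces to the two commands at the end.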
Indeed, few government-created data repositories can begin to compete with the ease-of-use and robustness of widely available systems like GitHub and its peers. In one case I was asked to make a one-paragraph placeholder abstract of one of my datasets available in a government data repository. It took six weeks, three organizations and an entire team of experts from both government and its contractors just to upload a single paragraph of text to what one of its maintainers had touted to me at the start as one of the world’s most advanced data repositories.
When I met for coffee with one of the most senior technology officials in the US Government and explained to him my frustrations, he said simply “I’m not surprised at all and really if you’re a creative thinker or highly innovative, government is not going to be a fit for you.” He continued this with “government innovation is about baby steps, effecting the most minuscule of change that advances the needle but doesn’t disrupt or upset anyone. We look for very narrow people that we can drop into a slot – right now we need someone who knows how to build HIPAA-compliant user input forms for a particular version of mobile device for our VA hospitals. We don’t want someone who is going to come in with bold ideas for transforming and fixing VA, we want someone who is just going to sit at their desk and write the particular form we ask for. Someone who gets a spreadsheet of tasks each morning and completes them each evening.”
Indeed, in a meeting with the data leads of another data-rich government agency, I was told they had no need to be innovative because government has no competition, people simply have to accept the services they provide as-is. Their job was simply to keep the lights on until the next administration arrives. One put it quite simply: “As a senior manager if I sit quietly and keep the legacy systems running as-is, I’ll retire with a nice pension. If I go and hire some innovative people to replace those systems with something modern and something goes wrong, I’ll get fired and publicly shamed. If it works and people are happy, my boss will take credit. Show me where the incentive is for me to do anything creative.”
Making matters worse, in government, much like the private sector, status is conferred through budget and headcount. Yet, unlike industry where many companies balance budgets against profits, government has the perverse duality of unlimited “free” money and few metrics to evaluate how “well” it is serving its citizens. When I prepared a report for one major government agency showing how a few widely used software tools could eliminate the 70-80% of the time its employees spent on rote data collection and preparation tasks and allow those employees to spend their time analyzing data instead of collecting it, there were gasps in the room as if someone had died. I was told that managers are promoted based on the number of people they oversee and that any technology solutions that reduced headcount would be ferociously fought against. When I noted that in this case technology would simply allow the existing employees to spend their time on more stimulating tasks and would actually likely improve retention, it was still viewed through the lens of headcount reduction instead of making employees and government more efficient and cost effective for the nation’s citizens. Moreover, in the contractor world, in which contracts can pay by the hour, efficiency is often viewed as a negative.
This focus on headcount over efficiency is one of the reasons that government struggles to provide the same level of service as private industry. One official government document sent to me with instructions on how to file certain required paperwork included this line: “As part of the Commissioner’s commitment to technological advancement and quality customer service, businesses are encouraged to submit their itemized lists on disk or CD.” Thinking the document was merely an outdated form that had accidentally been sent, I was amazed to see at the bottom “Last updated: October 2015.” The very existence of today’s companies like Uber is predicated on their ability to have total realtime visibility into their entire worldwide organizations at sub-second resolution. It is hard for government to match that kind of capability when it is asking citizens to mail in floppy diskettes through the post office.
This glacial pace pervades all areas of government, meaning that not only does government fail to serve its citizens in the most cost-effective and efficient ways possible, but the national security of the country suffers as the defense community is unable to keep pace with states which threaten national interests. As an editorial in the Wall Street Journal this past January put so succinctly “The Pentagon has fallen behind the pace of commercial innovation. During the Cold War, the defense industry created GPS and pioneered computer networking. Today defense companies scramble to learn from businesses that have developed self-driving cars, 3-D printers, biometric scanners and microsatellites. Advances in computer processing and cloud management have been largely driven by commercial firms and university partnerships.”
One of the reasons for this is that the government struggles severely to attract top talent, given that the salaries and benefits it is able to offer cannot begin to compete with the private sector. As a result, a common thread I’ve noticed in my tours of government is just how pervasive political appointees and former campaign personnel are in government data and innovation positions, even when the person has no expertise or experience of any kind in the area they are working in. While there certainly are bright, talented and driven data people in government, their contributions are all-too-frequently rendered moot by the layers of bureaucracy that stifle and stall their work, and most leave government quickly. Work schedules tend to be dictated by government regulations and customs, such as no work after hours or on weekends even during emergency rollouts, rather than Silicon Valley’s mantra of working around the clock until the job is done. The focus becomes process rather than outcome.
Innovation eventually makes its way into government, but often long after the original proposal and in piecemeal fashion. As but two minor examples, after presenting a blueprint last July for countering Islamic State propaganda using a counterintelligence mindset, I met with a number of people across government involved in CVE efforts, with each arguing either that current anti-ISIS efforts were on the verge of eliminating the group’s threat or citing various reasons why it would be impossible to mount a more aggressive campaign. Few that I met with had any expertise or background in counter-terrorism, Arabic or Islam, Iraq and Syria, or communications campaigns. A number were political appointees shuffling through various positions and roles, while some were simply fresh college graduates who thought their experience using Twitter in college qualified them as global experts on countering worldwide terrorist communication. One prominent group proudly told me they had just started earlier that month counting how many times their tweets were retweeted as their primary measure of global “influence.” Finally, earlier this month the Obama administration announced a new program that mirrors the playbook I outlined nearly a year ago, yet that year-long delay has allowed ISIS to entrench itself and adapt. Similarly, I heard from a number of policymakers who argued the “cyber first strike” approach I laid out last November could never happen, even as Russia apparently was already deep in the midst of its attack on Ukraine’s infrastructure using a nearly identical approach.
This creates a dangerous complacency in which an endless stream of press releases and announcements touts a myriad of government innovation programs and claims government is hiring the nation’s best and brightest, while “big data,” “data-driven” and “innovation” are laced through the titles of every conceivable initiative. Everyone considers themselves a data expert, and I routinely read biographies of both government employees and contractors along the lines of “foremost expert in the world on big data,” yet when I sit down with them and sketch out a flurry of ideas in a few minutes to revamp what they’re doing, I get blank expressions when it comes to the most basic of industry-standard techniques and technologies, let alone more advanced approaches.
In fact, I had the head of data sciences for one of the government’s largest contractors tell me that my own work was not useful to his organization “because it doesn’t fit into a single Excel spreadsheet.” Another very senior policymaker told me he saw no use for data mining or media monitoring because he watches CNN in his office that tells him absolutely everything happening in the world each moment.
While the US Government continues to generate a flood of announcements touting its focus on innovation and data expertise, this actually ends up being harmful to innovation in that it creates a false sense of change rather than acknowledging the limited impact and narrow thinking of current approaches. As the saying goes, if the government were truly implementing change and driven by innovation and technology, it wouldn’t need to keep repeating on a daily basis that it is being innovation-minded.
The Contractor Ferris Wheel
Given how difficult it is for government to offer the competitive salaries necessary to hire its own technical experts, much of the technical expertise available to government comes from a legion of private contractors. Stay in Washington long enough and you become intimately familiar with the great Ferris Wheel that is the government-contractor relationship, with people rotating from government to contractor to government to contractor as administrations change or new funding opportunities arise. Sit in any government meeting and at least a few of the people in the room are either currently working for a contractor or have in the past, while in some cases nearly the entire room are employees of major contractors.
One of the challenges is that these contractors frequently serve as advisors to government on technical matters in which those they advise lack the necessary technical skill to fully evaluate the advice they are receiving, or in which government procurement rules force agencies to select a solution they know is inadequate but that is the only option due to preapprovals or other certifications.
One particularly striking meeting early in my career in Washington vividly introduced me to this world of contracting. The purpose of the meeting was for me to brief a set of senior decision makers on the state of “big data” and data analytics for understanding the non-Western world, from available datasets to methodology nuances. As everyone went around the room introducing themselves, I was surprised to learn that just one person in the entire room was a government employee – every other person was employed by one of the major contracting firms as a technical adviser to government. As my briefing got underway, each person in the room forcefully recommended software and approaches that just coincidentally happened to align perfectly with the software and tools sold by their respective company. Instead of objective, independent technical advice, each adviser in the room was effectively a salesperson paid by the US Government to pitch their product.
In theory, those technical advisors are completely neutral experts, but in practice I have yet to meet an advisor who suggested tools or approaches other than those coincidentally sold by their own company. The majority of the companies whose contractors were in this particular room sold software that worked only on English-language Western content, so they all downplayed the importance of using local content in local languages to understand the non-Western world, with one going as far as to suggest that anything of importance is published in English – coincidentally, in the handful of data sources that his company happened to monitor as one of its government offerings. Just two companies in the room sold tools designed for languages other than English, and while both agreed on the criticality of monitoring non-English material, they processed only major European languages, so they argued at length that the ability to read Arabic was entirely unnecessary to understand local narratives in the Middle East.
One of the companies had recently deployed a new software package for keyword searching tweets and pitched it as being “99% representative” of local populations throughout the world. When I pointed to Twitter’s limited geographic and demographic reach and noted that large portions of the population in key areas of interest are not on Twitter, the contractor responded that they had proprietary techniques that eliminated these biases. I noted that no amount of extrapolation could make up for the kind of massive systematic bias in social media’s penetration into key communities, yet the company kept repeating that it had confidential techniques it was unable to say more about that fixed all of these limitations in Twitter and rendered it a nearly perfect view onto society. Despite my protests, the government representative seemed swayed by these arguments and agreed to deploy their solution in a series of upcoming production projects, rather than pilot it in a series of controlled experiments to test their claimed extrapolation procedures as I had suggested.
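The point about systematic coverage bias can be made concrete with a toy example (all numbers invented): if a subpopulation simply is not on the platform, no reweighting of the users you do observe can recover its views, because that group contributes zero observations to the sample.

```python
# Toy population: 60% in a well-connected group A, 40% in a group B
# with essentially no Twitter presence. Support for some policy
# differs sharply between the two groups (numbers are invented).
population = (
    [("A", 1)] * 600 +  # group A: 100% support, fully on the platform
    [("B", 0)] * 400    # group B: 0% support, entirely off the platform
)

true_support = sum(v for _, v in population) / len(population)

# A platform sample only ever observes group A, no matter how large
# it is or how cleverly it is reweighted afterward.
sample = [v for g, v in population if g == "A"]
sampled_support = sum(sample) / len(sample)

print(f"true support:    {true_support:.2f}")    # 0.60
print(f"sampled support: {sampled_support:.2f}")  # 1.00
```

The gap between 0.60 and 1.00 cannot be closed by any "proprietary extrapolation" over the sampled users, which is exactly why a controlled pilot testing such claims matters before a production deployment.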
Focus On Process Not Outcomes
One of the first lessons of the Washington data world is the emphasis on process over outcomes. Meetings tend to focus on specifying the entire analytic workflow or application from beginning to end, detailing every piece of software, every file format and version, and every possible feature and design decision, looking out over the next several years. I’ve actually heard government IT designers talk about designing software with all of the features their users will need for the next 25 years.
Such a mentality is partially driven by the long government acquisition process, which is more attuned to billion-dollar weapons systems that must last decades in service than to the breakneck speed of software evolution, in which new features are delivered daily and sometimes even hourly. In particular, modern software design tends to be highly modular and iterative, often with extensive A/B testing and multiple rapid prototypes that guide further development, compared with the monolithic up-front design of government IT projects.
Yet even with modular government data analysis projects, the mentality of specification and detail often overshadows the focus on what the software or analysis is supposed to achieve. I once sat through an hour-long briefing on a new project that included intricate flowcharts showing how every module, library and application was connected and the precise version number of every piece of software and file format, yet by the end of the talk I had absolutely no idea what the software was actually designed to do.
As a whole, the government data meetings I’ve attended often seem to start with a set of tools to be used and only eventually get around to what questions might be asked. The focus is on creating a highly documented, intricate process rather than on answering a question, which may not need a complex workflow at all, or which may need to evolve its methodologies as the problem is better understood. Government loves acronyms and code names, and sitting in a government data meeting can sometimes seem like listening to a foreign language. Long, elaborate titles are given to the simplest of tasks, making a Twitter keyword search sound like advanced data mining. Government is not alone in its love of process, but the lack of comprehensive evaluation of government programs means that process tends to trump outcome.
More to the point, government all too frequently views risk through the lens of process, focusing on mitigating execution risk rather than the risk that the project will not produce the desired outcomes. One official told me that when it comes to software and data projects, risk for him focuses on whether all funds are expended on schedule and in accordance with the contract, with the proper documentation, and whether deliverables match the checkboxes on the contract, not whether the delivered software or analysis actually solves the need or answers the questions for which it was intended. By redefining risk, government is able to claim success as long as the money is spent properly, rather than truly evaluating whether the solutions it is buying or building actually meet its needs.
Failure Of Evaluation
Perhaps the greatest failure of government when it comes to data science is the lack of evaluation. In Washington there is no such thing as failure. Even the most catastrophic of outcomes are spun to sound like successes and the lack of external evaluation and oversight means the public rarely sees just how dysfunctional and failed the US Government’s use of data really is. Government secrecy is commonly used to mask poor project outcomes, with project evaluations and precise data points being restricted as sensitive information.
In government, when programs go south, the public rarely gets a full picture of what went wrong, or even learns that the project was not successful. When DHS canceled a national air sampling program that had cost more than $1.1 billion, instead of acknowledging that the program had failed, it couched the cancellation in the vague governmental language of “cost-effective acquisition without compromising our security.” When the TSA last year failed 67 out of 70 tests in which contraband was carried through security, it responded only that “the numbers in these reports never look good out of context, but they are a critical element in the continual evolution of our aviation security,” while the House Oversight Committee noted that despite spending more than half a billion dollars on increased technology over the last half decade, performance levels had actually decreased.
Other times the government is more forthcoming, but only after wasting hundreds of millions of taxpayer dollars. The FBI’s Virtual Case File system burned through $170 million before being declared “incomplete, inadequate and so poorly designed that it would be essentially unusable,” while the National Institutes of Health spent $350 million on its CaBIG cancer network, which has “produced limited traction … and [is] financially untenable.”
Even when acknowledging failure internally, government often resists public acknowledgment or actual change. When the Army’s $2.3 billion DCGS-A platform was determined by its own evaluators to have “significant limitations” and to be “not suitable, and not survivable,” and was recommended to be replaced by commercial off-the-shelf software, the senior leadership allegedly promptly rescinded the recommendation and withheld all copies of it.
In fact, as one senior official once told me, there was stunned silence in his office when a junior official opened up their office’s weekly meeting by stating that the project he was overseeing was not succeeding and he was going to be making changes to it. The notion that an official might speak openly and candidly, even in an internal meeting, about a project’s lack of success was something absolutely unexpected and shocking. In many ways this summarizes what’s wrong with government evaluation today.
Perhaps the greatest problem is that the US Government simply has too few objective independent experts with the technical experience and expertise to properly evaluate its programs and offer sound advice to decision makers when it comes to the world of data.
I have personally attended countless briefings where contractors have made wild claims regarding the accuracy of their systems that are orders of magnitude beyond the best systems known in the commercial and academic worlds. I’ve heard everything from a contractor claiming 100% accuracy at machine translation of Arabic (which eludes even the best work from Google, Microsoft and Facebook), to numerous contractors claiming flawless accuracy at sentiment mining of social media, to countless contractors claiming to perfectly capture societal views from every corner of the earth through social media.
Early in my time in Washington, a contractor claimed its software package had 99% accuracy at extracting complex factual information from open text. When I asked for a live demo, I was told the company was unable to offer demonstrations “in order to protect certain proprietary aspects of its software.” When I suggested the software be tested on a particular human-tagged evaluation corpus widely used in the machine learning community, I was again told that such an evaluation would reveal proprietary aspects of the software. Instead, I was presented with an evaluation the company itself had performed using its own staff that touted the 99% accuracy figure, but it was essentially a piece of marketing literature, with no specifics as to how the software was tested, what kind of input text was used, and so on.
At another event early in my time in Washington, I spoke alongside a senior government official overseeing key data analysis programs for the US Government. When he touted accuracy figures well in excess of any known algorithms or methodologies, I asked how he was calculating these unbelievably low false positive and false negative rates, to which he replied that the methodology for computing them was deemed FOUO (unclassified but sensitive and restricted from public release). However, he did note in an offhand remark that “cases the evaluation team deem after the fact to have been difficult for a machine to identify are excluded from computed false positive and negative rates in order to better represent what we think is a fairer evaluation of the algorithms.” When I pointed out that false positive and false negative rates must reflect the actual output of the algorithm, not a manually compiled and cherry-picked subset of cases, he replied that this was how his office had always evaluated its data analytics programs.
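To see why excluding “difficult” cases invalidates the metric, consider a toy evaluation, with all data invented for illustration: the false positive rate must be computed over every truly negative case the system scored, not over a post-hoc subset chosen after the errors are known.

```python
# Toy evaluation (hypothetical data): dropping "hard" cases inflates accuracy.
# Each case: (predicted_positive, actually_positive, deemed_hard_after_the_fact)
cases = [
    (True,  True,  False),
    (True,  False, True),   # false positive, later labeled "hard" and excluded
    (True,  False, True),   # false positive, also excluded
    (False, False, False),
    (False, False, False),
    (False, True,  True),   # false negative, excluded
    (True,  True,  False),
    (False, False, False),
]

def false_positive_rate(cases):
    negatives = [c for c in cases if not c[1]]   # truly negative cases
    false_pos = [c for c in negatives if c[0]]   # predicted positive anyway
    return len(false_pos) / len(negatives)

honest = false_positive_rate(cases)
cherry_picked = false_positive_rate([c for c in cases if not c[2]])

print(honest)         # 0.4  (2 false positives out of 5 negatives)
print(cherry_picked)  # 0.0  (every error was conveniently "hard")
```

Because the “hard” label is assigned after the fact, it correlates perfectly with the errors themselves, so the reported rate measures nothing but the evaluators’ willingness to exclude.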
Indeed, he is far from alone. Despite the Pentagon spending billions of dollars on propaganda campaigns, the Government Accountability Office “found the programs were inadequately tracked, their impact was unclear and the military didn’t know if it was targeting the right people with its propaganda leaflets, website and broadcasts,” and the campaigns were “coming under criticism even within the Pentagon for spending hundreds of millions for poorly monitored marketing campaigns in Iraq and Afghanistan.” Similar to my own experiences, “after a drop of leaflets earlier this year in Syria, military officials say they could not provide measures of the effectiveness of the effort, saying it was classified information.” Indeed, a common refrain when attempting to evaluate government programs is that such data points are classified or otherwise restricted.
Just last year I saw an email go by from a major government program touting the enormous successes of its effort to fund data science research in a particular field. The major successes it listed were that the majority of its grantees had not received funding from that office before, that much of the funding had gone toward graduate student salaries and that the funding had generated mentions in the press and on academic email listservs. When I emailed back to ask for actual outcomes of the program in terms of major published research or changes to government programs in the area, the answer simply pointed again to the original bullet points. In short, the funding program was held up as a major success because it managed to give away all of its funds and received some modest press attention. Whether the program actually resulted in notable research or publications, or changed how government functioned, was entirely inconsequential.
In academia, when you publish a paper in a reputable journal, there is a peer review process in which other experts in the field review your paper and check that the data, methodology and findings appear sound and reasonable. Caveats buried in footnotes are rarely allowed, nor is cherry-picking data or redefining how key metrics like false positive and false negative rates are reported. The government data analyses I see, on the other hand, tend to be rife with such caveats buried deep within the resulting reports. One influential report I saw late last year claimed to analyze views across the Middle East and was filled with graphs, charts and various statistical significance tests. Yet buried in the endnotes on the last page, in very small print, was a disclaimer that the report had analyzed just 5-10 articles a day in total, despite monitoring 20 countries, and that the majority of the articles selected were in English. The report also divided articles into numerous complex narrative themes without listing the keywords used to define each theme, and made use of visualization methodologies like graph community finding without specifying the algorithms used or their seed parameters.
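The sample size alone dooms such a report: 5-10 articles a day spread across 20 countries leaves per-country estimates resting on a handful of observations. A quick back-of-the-envelope sketch, using the standard normal-approximation margin of error for a proportion with illustrative numbers, shows why significance tests on such data are hollow:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% normal-approximation margin of error for an estimated proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative worst case from the report: ~7 articles/day total.
# Across 20 countries that is roughly 2 articles per country per week.
print(round(margin_of_error(7), 2))  # 0.37 -- a 74-point-wide daily interval
print(round(margin_of_error(2), 2))  # 0.69 -- effectively no information
```

With confidence intervals wider than the entire range of possible answers, no coloring of charts or battery of significance tests can make the underlying estimates meaningful.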
Within the intelligence community in particular, there has been a narrow-minded focus on collecting data rather than analyzing it. This has led to a world in which government can assemble petabytes of information from every corner of the earth, yet misses the most critical of connections because it lacks the ability to pull all of this data together into a holistic view of the world. Indeed, I would say that the vast majority of the data programs I have come across in my three years in Washington have focused on collecting or publishing data rather than actually using data to improve governance or understanding. Government agencies I’ve met with often focus on how to make their data available to the public, which is a fantastic goal, but rarely follow that up with how they could use their own data to make their services more efficient, streamline their interactions with the public, or even gain new knowledge about the functioning of the nation.
Those projects I have come across tend to be of the most rudimentary kind, sometimes just a formula in an Excel file updated by hand each month, while even the most state-of-the-art applications I’ve seen pale in comparison to the kinds of analyses done in the commercial world. The people conducting these analyses often have senior titles and biographies claiming extensive experience with the most sophisticated techniques and touting their bleeding-edge expertise. Yet when I sit down to offer more efficient workflows or to suggest better algorithms more suited to the questions they are asking, I frequently get blank stares. In more cases than I can count, the workflow they are using turns out to be simply an open source script downloaded from the web, and they have no understanding of how it works, the nuances of the algorithms it uses, or even whether it yields accurate results on their particular data.
It is striking to me just how different many of the government and contractor data scientists I’ve met in Washington are from those I’ve met in Silicon Valley. Three weeks ago I spoke at Structure Data 2016 in San Francisco on my work exploring societal-scale data mining and had the distinct honor of talking with the data leads of a cross section of today’s marquee companies. Each had an incredibly deep understanding of the current state of the field, its technologies and methodologies, algorithms and workflows, nuances and strengths, and each was filled to the brim with ideas for using all of that data to streamline their companies and to serve their customers better.
Listening to them talk, these data scientists and data managers deeply understood how to think about data in the context of their organizations and how to evaluate the kinds of analyses they see from their staff or do themselves. The data science that I’ve seen in the Valley is an immensely nimble and fast-paced process focused on outcomes and the questions to be answered. In the best of projects, data scientists have background domain knowledge in the area in which they are working and collaborate hand-in-hand with the consumer of the analysis in an iterative process. Success is defined by whether the final analysis answers the question posed. In my own government experiences, I’ve had projects in which I was not even allowed to talk with the government client, having to route every clarification or question through the prime contractor, who would then translate the responses and relay them back to me, creating a lengthy game of telephone that severely impeded the natural analysis pipeline.
Government’s monolithic approach all but assures failure in the fast-paced world of software development and data analysis. The notion of precisely specifying up front every conceivable feature of a piece of software covering its usage over the coming decade or determining the entire workflow of an analysis before even knowing if the data exists to perform it, is simply not how the world works today. Instead of breaking projects into modular services with well-defined boundaries that are iteratively evolved with user feedback and A/B testing, government’s approach to predefining every aspect of a project costing hundreds of millions or even billions of dollars places the focus on process risk mitigation rather than outcome. When projects fail to deliver on their promise, government simply redefines its success metrics or hides behind claims of classification.
Decision makers, in turn, lack the ability to communicate their needs to data scientists or to understand the complexities of data analysis. When I hear actual comments like “Just have Twitter do a keyword search for all the terrorist stuff and hit delete – it’s simple” or “Why do I need to pay people to read news in other countries, everything of importance is on CNN,” it is clear that even the best of data scientists would struggle to find traction in such an environment.
Moreover, those data scientists are competing against a legion of politically connected companies, each pitching its products as the ultimate silver bullet for all the government’s data and analysis needs. From personal experience, it’s difficult to compete with a contractor pitching a tool that claims 99% accuracy at assessing worldwide populations from Twitter via magical proprietary algorithms that mitigate all of the geographic biases, coupled with 99% machine translation accuracy (99% seems to be a popular number in Washington). Senior decision makers rarely have the technical background to recognize far-fetched claims or to design pilot experiments to vet them.
Perhaps the greatest obstacle is not simply the dearth of data scientists in Washington but the lack of independent scientists who come from outside the government world and mentality, who can bring objective balance to their assessments, and who are familiar with how things are done in the “real” world rather than the insular government world. What is needed most are those who are either interested in serving their nation through a career in government or willing to bring their talents to government for a short while to help bring about change: the true believers in data science and innovation who are genuinely interested in helping their nation rather than accumulating titles or enriching themselves.
Truly infusing innovation into government requires bringing in not merely targeted “troubleshooters” to save failing projects, but people who can proactively reimagine how data can transform the interaction of government and citizen. It requires people with the ability to see past government’s narrow tunnel vision and develop creative solutions, yet with the skill and experience to execute those visions within the constraints of bureaucracy. Innovation isn’t just fixing a broken website, nor is it confusing incremental IT modernization with bringing government up to the level of the commercial world. Innovation is about fundamentally transforming how government serves its citizens. The country that brought us Mosaic and the modern graphical web, and companies like Google, Amazon, Facebook and Twitter that have come to define the Internet era, is also the country that asks its citizens to mail it floppy diskettes through the post office. How is it that the country that has come to define the web runs its government on technology many liken to stepping back into a time machine, while in Estonia you can file your taxes with three mouse clicks or pull a fishing license from your phone while standing on the banks of a river?
Because in truth, 20 years after the dawn of the modern web, “creating a health care website shouldn’t cost three quarters of a billion dollars in Washington and just an afternoon of volunteer coding in Silicon Valley.” Rather than turning its back on Silicon Valley’s creativity with phrases like “extraordinarily overqualified for government service,” Washington should embrace its unconventional innovation: not merely opening offices in the Valley, but actively courting creative thinkers and empowering a small cadre of the most innovative to float across Washington as evangelists, seeding and nurturing creative ideas, bringing people together and helping to reimagine what government in the 21st century could look like. If the US Government wants to move beyond “innovation” as a buzzword, perhaps it should spend a little less time on press releases touting itself as innovative and more time actually creating innovation. After all, if you have to spend all your time telling people you’re innovative while the rest of the world refers to you as a time machine to the past, perhaps it’s time to try something new.