Calibrate Your Moral Compass | Ethics in Data Science Part 3

This is the third and final installment of a series titled “Ethics: A Data Scientist’s Perspective.” If you haven’t read the first two parts, I recommend reading them first.

Let’s recap what we’ve covered so far. The first post in this series presented the solutions up front. The second post was very doom-and-gloom about what was wrong with some of the software-related rallies started in reaction to the Volkswagen scandal.

For this final post, we are going to dive deeper into the positive tips and tricks for staying sane in an ethically questionable world. Or rather, what I personally do to try to stay grounded in what I believe is ethical and moral.

Control and Consequences

Given all of these reasons for why the Volkswagen scandal happened, what can we do about it? Why did it happen in the first place? Darden Professor Luann Lynch with co-author Carlos Santos succinctly described the situation in their article on business ethics:

Darden Professor Luann Lynch peeked under Volkswagen’s hood to determine just how Germany’s premier car maker managed to destroy its once sterling reputation.

Her case study titled The Volkswagen Emissions Scandal, co-authored with former Darden students Elizabeth Bird and Cameron Cutro (both MBA ’16), suggests that a combination of autocratic leadership and lack of both controls and consequences led to a corporate culture that proved fertile ground for bad decisions.

When I compare what I’ve read about the behind-the-scenes stories of what happened before, during, and after major scandals, I see Lynch’s pattern. Not only does Lynch’s explanation fit Volkswagen; it also describes Enron. Perhaps this combination of factors didn’t directly cause either scandal, but it increased the probability of one.

The problem still remains: what does a single data scientist do to protect against this situation? Regulations, self-governing professional institutions, codes of conduct — all of these elements were present and available at the time of the Volkswagen scandal.

Think Like an Engineer

If you are a software developer working in a company with autocratic leaders, and there are no controls or consequences for your actions, how do you calibrate your moral and ethical compass?

In the first post in this series I shared the quick matrix that represents the choices you may need to make when facing an ethically questionable environment. The best place to start is the middle: reinforcing your internal sense of right and wrong. We will start there.

Let’s start with what I think are basics that any person writing code can practice: reproducibility, transparency, and honest feedback.

Reproducibility, Transparency, Feedback

Reproducible 

As much as is reasonable and practical, I make my work reproducible. I follow code standards, such as Google’s R Style Guide or PEP 8, to increase the legibility and modularity of the code. Someday I may need to take a break from my current code and come back to it months later. Will I understand the mess enough to pick up where I left off? Can others pick up easily in my place? Whenever I can, I document the crap out of what I do. I need to assume the project will be successful enough to last forever, and that archeologists will someday review it and want to understand it without requiring a cryptologist to decode it. I find templates like Cookiecutter Data Science to be a terrific way to streamline the work and ensure everyone knows where to find the right files, even across projects and clients.
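As a rough illustration of what reproducible work can look like in practice, here is a minimal Python sketch. The function names, the seed, and the sample numbers are all hypothetical and not taken from any real project; the idea is simply to fix the random seed, document each function, and save a small manifest describing how the result was produced.

```python
import hashlib
import json
import platform
import random


def run_analysis(values, seed=42, n_resamples=1000):
    """Bootstrap an interval for the mean of `values`.

    A seeded local generator is used so that rerunning the analysis
    with the same inputs gives exactly the same numbers.
    """
    rng = random.Random(seed)  # local generator, no hidden global state
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / len(sample))
    means.sort()
    return {
        "mean": sum(values) / len(values),
        "ci_low": means[int(0.025 * n_resamples)],
        "ci_high": means[int(0.975 * n_resamples)],
    }


def build_manifest(values, seed, result):
    """Record what someone would need to rerun and check the analysis."""
    data_hash = hashlib.sha256(json.dumps(values).encode("utf-8")).hexdigest()
    return {
        "python_version": platform.python_version(),
        "seed": seed,
        "data_sha256": data_hash,  # detects silent changes to the input data
        "result": result,
    }


if __name__ == "__main__":
    data = [2.1, 3.4, 1.9, 4.0, 2.8]  # made-up numbers for illustration
    result = run_analysis(data, seed=42)
    print(json.dumps(build_manifest(data, 42, result), indent=2))
```

Committing a manifest like this next to the code is one possible way to let a future reader confirm they are looking at the same data and settings that produced the original result.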

Transparent 

There is very little about our work in the world of data science that is proprietary and secret. Linear regression? Algebra? These aren’t patent pending technologies you need to hold close to the vest. The data science and machine learning and AI communities are extremely open and transparent about the work they do. I don’t know if in the future a client, peer, or archeologist will crack open my code and critique it, but I hope what they find is worthy of their respect and admiration.

Even in the middle of a project, I absolutely love that when I use the Cookiecutter Data Science approach in a private GitHub repo, my Jupyter notebooks are rendered as HTML. This means I can quickly visit a webpage to pull up the full analysis, including charts and write-up, and share it with coders and non-coders alike. Even better, at the end of the project I can hand over the entire repo, with much of the documentation already included.

Honest Feedback

It is vital that I solicit and accept feedback. As much as is reasonable and practical, the work I do should be transparent enough that others can review it and respond. In order to ask for and receive great feedback, I also find it helpful to know how to give feedback to others and reciprocate. I’ve covered this topic repeatedly in my role in Toastmasters International.

In my view, the above three elements — honest feedback, transparency, reproducibility — were a core part of the methodology that allowed a “little lab in West Virginia” to catch Volkswagen’s cheat, according to an article on the topic on NPR.

WVU research assistant professor Arvind Thiruvengadam and his colleagues test and experiment on cars and engines…. Volkswagen had the boldest claims and the highest sales, but Thiruvengadam tested two VW cars and found that the claims of low emissions never panned out in the real world. […]

He says the team kept double-checking its procedures. “And then, I mean, we did so much testing that we couldn’t repeatedly be doing the same mistake again and again,” he says.

Volkswagen was cheating. That’s what everyone in the project began to suspect but wouldn’t dare to say out loud.

“It’s the sort of thing you just don’t go around accusing companies of doing unless you’re absolutely sure,” says John German, with the International Council on Clean Transportation — the group that commissioned the test.

Without a system of transparency, reproducibility, and honest feedback among peers in the group and outside of the research team, Thiruvengadam and his colleagues would have had a very hard time convincing anyone to listen to their claims that they had caught Volkswagen cheating.

Bring It Together

You may find yourself in an environment where it is not common to practice these three elements. You should at least be able to put them into practice yourself, and hopefully others around you will catch on. If that’s not working then you probably shouldn’t stay too long in that environment.

Having work that is reproducible, transparent, and improved through honest feedback will not by itself make you and your colleagues more ethical or moral. The absence of these elements from your workflow, however, is a bad sign. It suggests there are fundamental problems working against your ability to do good work, work you can be proud of. It may also suggest a lack of basic controls and consequences throughout the department. This could lead to problems that aren’t malicious but accidental. Finally, a lack of controls and consequences was, according to Lynch, a key ingredient in the recipe that created the Volkswagen scandal.

Conversations

Assuming you have these three elements as part of your basic workflow as a data scientist, we can move on to the strengthening of your moral compass through conversations — with other humans, with tribes, with reading and videos, and with yourself.

Humans

There are many ways to engage in conversation. You can even do it by yourself. Having conversations that challenge and reinforce what you personally believe is right and wrong is very healthy and important.

Tribes

A related practice is associating with a tribe that has a very strong sense of self and a very strong moral compass. One activity I look forward to every year is attending DEFCON. You would think a bunch of hackers and security advocates would be morally ambiguous. I found the opposite to be true. The things I learn each year remind me to stay vigilant. They also encourage me to have dialog and debates with others about what is right and wrong. The friends that I meet with every year at this conference have the strongest sense of personal ethics and morals of any people I’ve ever met. Spending time in a tribe that challenges and reinforces my sense of right and wrong helps to vaccinate me against moral and ethical dilemmas.

Reading Blogs

I find it fascinating how reading the same thing repeatedly gives you different results, largely thanks to the mood and context you bring. Your current mood and your personal, historical narrative reflect against the words and ideas you are reading in this paragraph. It is engaging in a dialog with text like this, or in conversation with other humans, or even with yourself, that helps to strengthen your sense of morals and ethics.

Reading Encyclopedia of Ethical Failure

Speaking of ethical and moral dilemmas, a great resource that is updated regularly is the Encyclopedia of Ethical Failure.

The Standards of Conduct Office of the Department of Defense General Counsel’s Office has assembled the following selection of cases of ethical failure for use as a training tool. Our goal is to provide DoD [Department of Defense] personnel with real examples of Federal employees who have intentionally or unwittingly violated the standards of conduct. Some cases are humorous, some sad, and all are real. Some will anger you as a Federal employee and some will anger you as an American taxpayer.

I decided not to include a direct link because such links tend to break frequently as government sites reorganize and rarely provide proper redirects. As I write this, the last full version I found is the 2015 edition, along with 2016 and 2017 updates posted on the Resources page of the Department of Defense Standards of Conduct Office. I encourage you to do a Google search for the latest copies. This guide is absolutely worth a read.

Fork Your Own Code

Remember how I said that the data science and software communities already had self-governing bodies and regulations and standards?

Assuming these factors are correct, what can one data scientist or software programmer hope to do to avoid being caught up in a scandal such as this?

You have an opportunity to create your own code of ethics for your tribe. Here is how you would go about doing this.

For starters, you can simply do an online search for the word “ethics” or “code of conduct” plus whatever keyword fits your role. In my case I would search “data science code of ethics.” Results that I found included these excellent starting points:

http://www.datascienceassn.org/code-of-conduct.html

https://www.acm.org/code-of-ethics

I also found this great article on O’Reilly about how to mitigate unwanted side effects when performing data analysis by asking these four questions:

1. Are the statistics solid?
2. Who wins? Who loses?
3. Are the changes in power structures helping?
4. How can we mitigate harms?

With these links and search queries as starting points, you can find the best version that works for you, fork a copy of it, and make edits so that it fits best for you and your tribe.

Front Page News

Probably the simplest test of ethical or moral fortitude comes from the Sage of Omaha, Warren Buffett:

“Do nothing that you would not be happy to have an unfriendly but intelligent reporter write about on the front page of a newspaper.” — Warren Buffett

K-GULF

Lastly, I’d like to share an acronym I use personally, distilled from what I learned reading What Color Is Your Parachute? over a decade ago. The acronym is K-GULF.

When I have to decide between two options and the right solution is unclear, I use K-GULF to figure out the right path: “Which option maximizes the amount of kindness, generosity, understanding, love, and forgiveness in the universe?” If I choose an option other than the answer to that question, I had better have a damn good reason for it.

Please keep in mind that morals and ethics are not fundamental laws of nature; they’re part of an ever-evolving narrative we hold in our minds and share with other humans.

What I’ve shared in this series is my take on the topic as of the time I wrote it, but I’m only one voice. What you’ve read above is my opinion, based on what has personally worked for me up until now. Your own mileage may vary. Please don’t take my word for it. Try it yourself, and use what works for you. Good luck!

Mike Zawitkowski is a full stack data scientist. He has worked with big data and machine learning problems since 1999—before “big data” and “machine learning” were trendy.

