David Merrell's Blogsite
I am pursuing a PhD in Computer Sciences at UW-Madison. This blog/website shows some of the things I've been up to.
/
Thu, 07 Sep 2017 17:00:27 +0000
Jekyll v3.5.2

Algorithmic Fairness Research gets Covered in the News

<p>Our algorithmic fairness group has been blessed with fame and riches.</p>
<p>A few days ago, this article went up on the University of Wisconsin main
webpage:</p>
<p><a href="http://news.wisc.edu/uw-madison-researchers-tackle-bias-in-algorithms/">UW-Madison researchers tackle bias in algorithms</a></p>
<p>I’m proud of myself for not looking dumb in the photo, or sounding dumb
in my quote.</p>
<p>So that’s the aforementioned “fame”. The “riches” refer to a $1M NSF grant
awarded to my advisor, Aws Albarghouthi, a couple of months ago.
My research will be funded with
this grant; I am very grateful to Aws and the NSF for this opportunity.</p>
<p>\( \blacksquare\)</p>
<p><strong>2017-07-12: Since posting, our research has gotten modest exposure in some other news outlets. I’m gathering links to articles here:</strong></p>
<p><em>Wisconsin State Journal—<a href="http://host.madison.com/wsj/news/local/govt-and-politics/uw-software-aims-to-find-and-fix-biased-computer-programs/article_7f261c21-a107-5841-92b6-9ffbd69eca9a.html">UW software aims to find and fix biased computer programs</a></em></p>
<p><em>TechRepublic—<a href="http://www.techrepublic.com/article/fairness-verification-tool-helps-avoid-illegal-bias-in-algorithms/">Fairness-verification tool helps avoid illegal bias in algorithms</a></em></p>
<p><em>StarTribune—<a href="http://www.startribune.com/wisconsin-researchers-awarded-grant-to-fix-algorithmic-bias/433648293/">Wisconsin researchers awarded grant to fix algorithmic bias</a></em></p>
<p><em>San Francisco Chronicle—<a href="http://www.sfchronicle.com/news/article/Wisconsin-researchers-awarded-grant-to-fix-11277846.php">Wisconsin researchers awarded grant to fix algorithmic bias</a></em></p>
Fri, 07 Jul 2017 19:20:00 +0000
/research/2017/07/07/wisconsin-news.html
Tags: fairness, nsf, grant, funding, news, research

Paper Accepted to IJCAI 2017

<p>I have been negligent in my blogging. But I have been busy, so I don’t
feel too bad.</p>
<p>One of the more interesting things to happen in the past seven months
is that my advisors and I had a paper accepted to the
<a href="https://www.ijcai.org">International Joint Conference on Artificial Intelligence (IJCAI)</a>.
Very exciting! I will be attending the
<a href="http://www.ijcai-17.org">conference in Melbourne, Australia</a> toward the end of August.</p>
<p>It’s related to work mentioned in previous posts, about
Symbolic Volume Integration (in the paper, we call it Weighted Model
Integration). We worked out a way to make our method
converge more efficiently.</p>
<p>It’s kind of neat—it relies on QR factorization, Givens rotations,
and Pythagorean triples. It also touches on some fundamental results
in probability, such as the Skitovich-Darmois theorem.</p>
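<p>As a taste of the machinery involved, here’s a minimal sketch (my own toy illustration, not code from the paper) of a Givens rotation—the building block of QR factorization—chosen to zero out one component of a vector. I’ve picked the 3-4-5 Pythagorean triple so the rotation entries come out rational:</p>

```python
import numpy as np

def givens(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = np.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

# Rotate (3, 4) onto the x-axis; note the Pythagorean triple 3-4-5.
c, s = givens(3.0, 4.0)
G = np.array([[c, s], [-s, c]])
print(G @ np.array([3.0, 4.0]))  # -> [5. 0.]
```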
<p><a href="/assets/dmerrell-ijcai-2017.pdf">Here’s a link to the camera-ready submission.</a></p>
<p>\( \blacksquare\)</p>
Fri, 07 Jul 2017 19:00:00 +0000
/research/2017/07/07/IJCAI-submission.html
Tags: probability, integration, inference, IJCAI, paper, research

CS 761 Project: Spectral Methods for Latent Variable Models

<p>One of my courses this past semester was <a href="http://pages.cs.wisc.edu/~jerryzhu/cs761.html">CS761</a>,
“Advanced Machine Learning”.
The idea of the course was to present some of the mathematical and statistical
theory underlying machine learning. It was taught by
Dr. <a href="http://pages.cs.wisc.edu/~jerryzhu/">Xiaojin (Jerry) Zhu</a>.</p>
<p>One of the things we had to do was a course project. Since CS761 was a
theory course, the project needed to be of a theoretical nature. Rather than
merely <em>applying</em> machine learning to some problem, we were expected to peruse the
theoretical machine learning literature and try making some contribution to the
research.</p>
<p>Three weeks before the semester ended—after much procrastination—I
finally buckled down and got to work. I teamed up with
<a href="https://www.cs.wisc.edu/people/parikshit">Parikshit Sharma</a>; we had
both, independently, stumbled on the topic of <em>spectral methods</em> for
latent variable models, and decided to pursue possibilities in that area.</p>
<p>The main idea is that we can sometimes learn the parameters of
a Bayesian network by computing a <em>spectral decomposition</em>
of certain <em>empirical moment</em> tensors obtained from the data.
Parikshit and I tried to do this for new kinds of Bayesian networks.
We were not ultimately successful; however, the project was still
an instructive exercise, and I feel like I got pretty familiar with
the subject matter.</p>
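<p>To give a flavor of the idea, here’s a toy sketch (my own illustration, not from our project): given two conditionally independent views of a hidden variable, the population cross-moment matrix \(E[x_1 x_2^T]\) factors as \(A \,\mathrm{diag}(w)\, B^T\) and so has rank equal to the number of hidden states—which a spectral decomposition of the empirical moment matrix reveals.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
k, d, n = 3, 6, 200_000          # hidden states, observed dim, samples

w = np.array([0.5, 0.3, 0.2])    # hidden-state distribution
A = rng.normal(size=(d, k))      # conditional means, view 1
B = rng.normal(size=(d, k))      # conditional means, view 2

h = rng.choice(k, size=n, p=w)   # latent draws
x1 = A[:, h].T + 0.1 * rng.normal(size=(n, d))
x2 = B[:, h].T + 0.1 * rng.normal(size=(n, d))

# Empirical cross-moment E[x1 x2^T]; its population value is A diag(w) B^T,
# which has rank k -- the spectrum exposes the number of hidden states.
M2 = x1.T @ x2 / n
s = np.linalg.svd(M2, compute_uv=False)
print(s)  # the first k singular values dominate; the rest are near zero
```

Going from this rank observation to actual parameter recovery (the hard part) is what the tensor-decomposition literature handles.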
<p>I felt kind of proud of our work. And it got a decent score. So I’ve decided to post it here.</p>
<p><a href="/assets/merrell-sharma-cs761-project-2017.pdf">Here’s a link to our completed project.</a></p>
<p>\( \blacksquare\)</p>
Fri, 05 May 2017 19:00:00 +0000
/coursework/2017/05/05/advanced-ML-project.html
Tags: machine learning, bayesian, tensor, coursework

Completed StarCraft Optimization Project

<p>Well, it’s the end of the semester.</p>
<p>I’m kind of disproportionately proud of the project I did for my
optimization class; I described it in a
<a href="/coursework/2016/11/05/StarCraft-MIP.html">previous post</a>.</p>
<p><a href="/assets/dmerrell-writeup.pdf">Here’s a link to my finished report.</a></p>
<p><a href="/assets/sc-mip-model.gms">Here’s a link to the GAMS code that I wrote for this project</a>
\( \blacksquare\)</p>
Tue, 06 Dec 2016 22:50:00 +0000
/coursework/2016/12/06/StarCraft-MIP-completed.html
Tags: optimization, starcraft, integer programming, coursework

Optimization Project: Build Orders in StarCraft

<p>I’m taking a course in optimization this semester. It includes a project.</p>
<p>I’ve always been kind of fascinated by real-time strategy games from
a theoretical standpoint. Growing up I was never very good at playing them, but they
captured my imagination. Players face an immense decision space; it’s clear
that there are good and bad choices to make, but trying to identify the
<em>best</em> choice always seemed kind of unattainable.</p>
<p>About 1.5 years ago, I got the idea in my head to try making an AI for StarCraft.
I tried using the StarCraft Brood War API, but the project eventually fell by the
wayside as I (a) realized the scale of that project and (b) got distracted by
other things.</p>
<p>With these prior experiences in mind, I decided to tackle one particular aspect of StarCraft for my optimization project—
early game build orders. I’ll try to answer this question: what choices maximize
military strength within the first few minutes of play? This is useful to know,
whether you’re trying to rush your opponent or defend yourself from a rush.</p>
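<p>To make the trade-off concrete, here’s a drastically simplified toy (hypothetical numbers, nothing like my actual GAMS model): each tick you collect income per worker, then may buy a worker (more future income) or a soldier (the objective). With a short horizon, exhaustive search over build orders shows the economy-vs-army tension an optimizer has to navigate:</p>

```python
from itertools import product

# Hypothetical toy economy (not StarCraft's real numbers): each tick we
# collect 5 minerals per worker, then may buy one worker or one soldier
# for 50 minerals. Objective: soldiers owned after T ticks.
COST, INCOME, T = 50, 5, 8

def simulate(plan, minerals=50, workers=5):
    soldiers = 0
    for action in plan:
        minerals += workers * INCOME
        if action == "worker" and minerals >= COST:
            minerals -= COST
            workers += 1
        elif action == "soldier" and minerals >= COST:
            minerals -= COST
            soldiers += 1
    return soldiers

# Exhaustive search over all 3^T build orders.
best = max(product(["worker", "soldier", "wait"], repeat=T), key=simulate)
print(best, simulate(best))
```

Over this short horizon a worker never pays for itself (5 minerals/tick for at most 7 remaining ticks &lt; its 50-mineral cost), so building only soldiers is optimal—exactly the kind of horizon-dependent conclusion a real build-order model should capture.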
<p><a href="/assets/dmerrell-proposal.pdf">Anyways, here’s a link to my proposal.</a>
Hopefully it gets approved and I can get to work on it.
\( \blacksquare\)</p>
<p>Note: Interestingly, Google’s DeepMind and Blizzard Entertainment made
<a href="https://deepmind.com/blog/deepmind-and-blizzard-release-starcraft-ii-ai-research-environment/">a relevant announcement yesterday</a>.
They’re collaborating to open StarCraft II as a testbed for AI research.
I couldn’t agree more with their reasoning that real-time strategy games are an
excellent environment for agent design.</p>
Sun, 06 Nov 2016 04:50:00 +0000
/coursework/2016/11/05/StarCraft-MIP.html
Tags: optimization, starcraft, integer programming, coursework

Revisiting the Stanford AI100 Study

<p>The first homework assignment for the Intro to AI class ended up
being an essay. We were each tasked with writing a 500-600 word
critique of the Stanford AI100 article. In the end, I decided not
to try critiquing the article on any technical basis; I didn’t
feel qualified to do that. Rather, I decided to focus on some aspects
of the article’s scope. Here’s the result:</p>
<blockquote>
<p>The Stanford AI100 Study seeks to record and influence the impact of artificial
intelligence in a series of articles over the course of 100 years. This
ambitious endeavor is overseen by a Standing Committee who periodically select
a Study Panel and charge them with the task of producing a written article.
The first Study Panel, appointed in 2015, recently published the very first
installment of this series in a report titled “Artificial Intelligence and
Life in 2030”. Its scope is to describe the opportunities and challenges
that may arise from AI in the next fifteen years, restricted to society in
North American cities. Given the uncertainty inherent to technological
advancement and the universality of AI’s significance for humanity, the
Standing Committee’s framing choices are overly speculative and
overly parochial.</p>
</blockquote>
<blockquote>
<p>The Study Panel’s mission to anticipate the status of technology in 2030 is
misguided, given the uncertainty inherent to technological progress. Most
technological advancement is accounted for by unexpected, improbable
discoveries; “black swan” events, as Nassim Taleb calls them. As such,
there are strong limitations on anyone’s ability to predict technological
progress. To the Study Panel’s credit, they don’t ever seem to let this
forecasting requirement guide their analyses in any concrete way; every
reference they make to the year 2030 could easily be replaced by a more
conservative reference to “the near future”. However, the fact that the
Standing Committee framed the report in this way suggests an unfounded optimism
in the Study Panel’s ability to forecast technological advancement. It would
be more honest to frame future AI100 reports in a way that emphasizes assessing
the current state and identifying immediate opportunities, rather than
attempting forecasts ten or more years into the future.</p>
</blockquote>
<blockquote>
<p>The report’s restriction to North American cities unnecessarily limits its
perspective; it reflects the homogeneity of the Standing Committee and Study
Panel. Artificial intelligence poses opportunities for all of mankind, and
challenges all of mankind with important questions. It is important to avoid
parochialism when surveying its implications. The provincialism of the first
report is undoubtedly related to the homogeneous composition of the people
involved. Five of six Standing Committee members are Americans; the sixth is
Canadian. Furthermore, fourteen of seventeen Study Panel members are current
long-term residents of the US; a fifteenth is Canadian. None of them are from
East Asia or continental Europe. Developing regions, such as Africa or South
America, have no representation. It is unclear whether the Study Panel was
appointed before or after the scope was specified; whatever the case, the
Standing Committee’s choices centered myopically on their own piece of the
world. While there is some rationale for starting small and “sticking to what
you know”, there is also risk associated with homogeneous perspective. It
blinds the group to “unknown unknowns”, increasing the fragility of its work
with respect to unforeseen contingencies.</p>
</blockquote>
<blockquote>
<p>“Artificial Intelligence and Life in 2030” is only the first of many
AI100 reports, and the next will likely be more global in scope. However,
it was disappointing to see the first report framed in such an awkward
fashion—culturally narrow and temporally distant. The Standing Committee
ought to frame future reports in a more universal fashion and ought to appoint
more heavily diversified Study Panels to write them. This may require the
Standing Committee to form new connections, with people who are unfamiliar to
them. However, if AI advancement is of fundamental importance,
and if the Stanford AI100 Study wishes to provide authoritative guidance in
this field, then it is crucial for the Standing Committee to avoid tunnel
vision or unproductive speculation as it frames these studies.</p>
</blockquote>
<p>\( \blacksquare\)</p>
Sat, 01 Oct 2016 10:30:00 +0000
/coursework/2016/10/01/stanford-100-revisited.html
Tags: stanford, reading, ai, machine learning, ai100, coursework

Tilt Equivalence in Symbolic Volume Integration

<p>I’ve been continuing my involvement in “Algorithmic Fairness” research.
In particular, I’ve been working on an improvement to the Symbolic
Volume Integration (SVI) method mentioned in a <a href="/research/2016/09/05/symbolic-volume-integration.html">previous post</a>.
The idea is to modify SVI in a way that makes it converge
faster. While the previous SVI method depends on the construction of rectangles
that are aligned with the axes, we’re aiming to allow rectangles that
are “rotated” or “tilted” with respect to the axes.</p>
<p>To that end, I showed that rotating a rectangle for SVI is equivalent to
applying the inverse rotation to the region of integration.</p>
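<p>The equivalence is just the change-of-variables formula for a rotation (determinant 1): integrating a density over a rotated rectangle equals integrating the rotated density over the original rectangle. Here’s a quick Monte Carlo sanity check of that fact (my own illustration, not the writeup’s proof), using an anisotropic unnormalized Gaussian so the result isn’t trivially rotation-invariant:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

f = lambda p: np.exp(-(p[:, 0]**2 + 2.0 * p[:, 1]**2) / 2.0)  # unnormalized PDF
lo, hi = np.array([0.0, 0.0]), np.array([1.0, 2.0])           # the rectangle
n = 400_000

# (1) Integrate f over the *rotated* rectangle R(rect): sample a bounding box,
#     keep points whose preimage R^{-1} x = R^T x lies in the rectangle.
box = rng.uniform(-3.0, 3.0, size=(n, 2))
back = box @ R                       # rows are R^T applied to each sample
inside = np.all((back >= lo) & (back <= hi), axis=1)
est_rotated = 36.0 * np.mean(inside * f(box))   # box area = 36

# (2) Integrate f(R y) over the original axis-aligned rectangle.
y = lo + rng.uniform(size=(n, 2)) * (hi - lo)
est_axis = 2.0 * np.mean(f(y @ R.T))            # rectangle area = 2

print(est_rotated, est_axis)  # the two estimates agree up to MC error
```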
<p><a href="/assets/tilt-equivalence.pdf">Here’s a link to the writeup.</a>
\( \blacksquare\)</p>
Sat, 01 Oct 2016 10:00:00 +0000
/research/2016/10/01/tilt-equivalence.html
Tags: wisconsin, albarghouthi, dantoni, drews, microsoft, nori, fairness, research

The Stanford One Hundred Year AI Study - 2016

<p>In 2014 Stanford began a <a href="https://ai100.stanford.edu/">100-year study</a>
on the societal implications of artificial intelligence. Over
the next 100 years a (gradually changing) board of top AI researchers
will publish periodic papers giving a bird’s eye view on the current state of
AI research. The purpose of these papers is to record AI’s effect on
humanity, anticipate future developments, and inform policy makers of
relevant issues; to separate substance from sensationalism.</p>
<p>Five days ago, Stanford released the first of these papers
<a href="https://ai100.stanford.edu/sites/default/files/ai_100_report_0901fnlc_single.pdf">(pdf)</a>.
It’s worth reading yourself—there’s an executive summary if you’re short on
time. In this post I’ll list some of the things that stood out to
me as I read through it.</p>
<ul>
<li>
<p>In this first paper, the researchers narrow their scope to North American
cities. That is, they describe the current impact of AI on the inhabitants of
North American cities, and envision possible developments to the year 2030.
While the One Hundred Year Study has international ambitions, this is a
practical starting point given the study’s newness and North America’s
comparative development.</p>
</li>
<li>
<p>The paper covers many aspects of society—transportation, healthcare,
education, public safety, employment, entertainment, home life—though they
leave out defense and military. The study acknowledges this gap, asserting that
military applications fall outside the scope of North American cities. It seems likely that a thorough treatment of military applications would have dominated
the paper. It also seems likely to me that they had difficulty accessing
information on the state of military tech, given their intention of
publicizing it.</p>
</li>
<li>
<p>The paper dedicates much discussion to “low-resource communities”, describing
the effect of AI on disadvantaged demographics. It frequently stresses the
importance of ensuring that our technologies are unbiased and do not perpetuate
unfair discrimination. This makes my budding involvement in algorithmic fairness
feel relevant. So that’s nice.</p>
</li>
<li>
<p>The paper notes the recent trend of data-intensive machine learning
(e.g. “deep learning”) displacing most other lines of inquiry, but
suggests that it is worthwhile to recognize the limitations of this path.
As someone who feels that the current trend does little to improve our
understanding of intelligence (though I acknowledge the usefulness of deep
learning), I felt validated when the paper gave the following advice:</p>
</li>
</ul>
<blockquote>
<p>We encourage young researchers not to reinvent the wheel, but rather to
maintain an awareness of the significant progress in many areas of AI
during the first fifty years of the field, and in related fields such as
control theory, cognitive science, and psychology.</p>
</blockquote>
<p>Well, that’s all for now.</p>
Wed, 07 Sep 2016 01:00:00 +0000
/coursework/2016/09/06/stanford-100-year-16.html
Tags: stanford, reading, ai, machine learning, ai100, coursework

Proving Algorithmic Fairness II - Symbolic Volume Integration

<h1 id="background">Background</h1>
<p>This is a walk-through explanation of the Symbolic Volume Integration method
mentioned in
<a href="/research/2016/09/01/proving-algorithmic-fairness.html">Part I</a>
of this Algorithmic Fairness post series.</p>
<p>In order to prove or disprove the fairness of an algorithm, we must prove or
disprove the following inequality:</p>
<script type="math/tex; mode=display">\frac{P[\mathcal{P}(\vec{v}) \; \land \; v_s = true] \cdot P[v_s = false]}{P[\mathcal{P}(\vec{v}) \; \land \; v_s = false] \cdot P[v_s = true]} > 1 - \epsilon</script>
<p>where \(v_s \) is an entry of \(\vec{v}\) indicating whether the person
belongs to a protected class—e.g., a particular religion or ethnicity—and
\(\epsilon < 1\) is some agreed-upon or mandated standard of fairness, with
smaller \(\epsilon\)s implying stricter fairness requirements.</p>
<h1 id="motivation-proof-beats-confidence">Motivation: Proof Beats Confidence</h1>
<p>In the world of scientific computing, probabilities—like those in the above
inequality—are typically
estimated using Markov Chain Monte Carlo techniques, which are known to perform
well in a variety of cases and allow the underlying process to be treated as a
black box. Due to the stochastic nature of MCMC, any conclusion based on it must
be stated in terms of statistical certainty. In this respect, the new Symbolic Volume
Integration method has an advantage
over MCMC—it allows a <em>proof</em> of the fairness condition,
rather than a statement of statistical confidence. With proof within
convenient reach, it is unjustifiable to settle for mere confidence.</p>
<p>Symbolic Volume Integration can prove (or disprove) the fairness
inequality because it computes guaranteed lower bounds on the probabilities involved.
Note that in a probabilistic setting the method can also be used to compute
upper bounds on probabilities; if we need an upper bound on
\(P[A] = 1 - P[\neg A]\), it suffices to find a lower bound on
\(P[\neg A]\). By finding lower bounds on the probabilities in the numerator
and upper bounds on the probabilities in the denominator, we can prove the
fairness condition. Likewise, upper bounds on the numerator and lower
bounds on the denominator would allow us to disprove the fairness condition.</p>
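<p>The bookkeeping in the previous paragraph fits in a few lines of code (a hypothetical helper for illustration, not part of the actual tool): given interval bounds on each of the four probabilities, bound the ratio from both sides and report whether fairness is proven, disproven, or still undetermined.</p>

```python
def fairness_verdict(p_pos_s, p_not_s, p_pos_not_s, p_s, eps):
    """Each probability argument is an interval (lower, upper).

    The quantity being bounded is the fairness ratio
        P[P(v) and s] * P[not s] / (P[P(v) and not s] * P[s]) > 1 - eps.
    """
    (a_lo, a_hi), (b_lo, b_hi) = p_pos_s, p_not_s
    (c_lo, c_hi), (d_lo, d_hi) = p_pos_not_s, p_s

    ratio_lo = (a_lo * b_lo) / (c_hi * d_hi)   # worst case for fairness
    ratio_hi = (a_hi * b_hi) / (c_lo * d_lo)   # best case for fairness

    if ratio_lo > 1.0 - eps:
        return "fair"      # proven: even the worst case clears the bar
    if ratio_hi <= 1.0 - eps:
        return "unfair"    # proven: even the best case falls short
    return "unknown"       # bounds too loose -- keep refining them

# Tight bounds around P[hired ^ s] = 0.112, P[not s] = 0.6,
# P[hired ^ not s] = 0.18, P[s] = 0.4 (true ratio ~ 0.93):
print(fairness_verdict((0.110, 0.114), (0.595, 0.605),
                       (0.178, 0.182), (0.395, 0.405), eps=0.15))
```

With a stricter \(\epsilon\) the same bounds may be inconclusive, which is the signal to let the integration scheme refine them further.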
<p>While Symbolic Volume Integration allows us to obtain a proof of algorithmic
fairness, it does require an intimate knowledge of the algorithm in
question. A black box representation of the algorithm will not suffice;
Symbolic Volume Integration requires the contents of the algorithm so that it
can translate them into a set of logical constraints<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>.</p>
<h1 id="the-algorithm">The Algorithm</h1>
<p>Symbolic Volume Integration consists of the following steps:</p>
<ul>
<li>Translate Code into a Predicate Logic Formula
<ul>
<li>Assuming that the population model and decision program can
both be expressed as sequences of assignments, probabilistic
assignments, and conditional expressions involving simple arithmetic
(addition and multiplication), we can translate their sequential,
imperative commands into conjunctions of declarative logical statements.</li>
<li>Since we ultimately care about the <em>composition</em> of the population
model and the decision program—letting the outputs of the population
model be the inputs of the decision program—we can create one big
logic formula by conjoining their individual logic formulas.</li>
<li>We conjoin additional requirements to the formula, in order
to specify the possibility for which we wish to compute probability.
For example, if we wanted to compute
\(P[\text{hired} = true \land \text{age} \ge 60]\)
, we would conjoin \((\text{hired} = true \land \text{age} \ge 60)\)
to the formula. We now have a Big Formula.</li>
<li>For each probabilistic assignment in the program, you can think of there
being a free variable in our big formula. E.g. if a random value for
“age” is generated in the population model, it would be a free variable
in the logic formula.</li>
<li>The free variables just mentioned form a real space, \(\mathbb{R}^n\)
(where \(n\) is the number of free variables). Only certain combinations
of these variables will actually satisfy the big logic formula we constructed.
It turns out that the valid combinations form a region in \(\mathbb{R}^n\),
which we will call the “admissible region”. We can find points in the admissible
region by using an SMT solver (“Satisfiability Modulo Theories”; in our case,
we use an SMT solver called Z3). Bear in mind that the free variables
are probabilistically assigned, so each of them has an associated PDF. In
combination, this implies a joint PDF over \(\mathbb{R}^n\) formed by
their product (they are independently distributed).
Hence, computing probability for a given scenario entails
integration of the joint PDF over the scenario’s admissible region.</li>
</ul>
</li>
<li>Decompose the Admissible Region into Hyperrectangles</li>
<li>Integrate over the Hyperrectangles and sum the results, yielding a guaranteed lower bound on the probability of interest</li>
</ul>
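<p>The last two steps can be sketched in miniature (my own toy, with assumptions the real method doesn’t need: two independent standard-normal variables and a region simple enough to grid by hand). For the event \(\{x &gt; 0 \land y &gt; x\}\), keep only axis-aligned cells lying entirely inside the admissible region; the product PDF integrates exactly over each cell via 1-D CDF differences, and the sum is a certified lower bound on the true probability, which is \(1/8\) by a symmetry argument (the region is the angular wedge from 45° to 90°).</p>

```python
from math import erf, sqrt

def Phi(z):                      # standard normal CDF
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def cell_mass(x0, x1, y0, y1):   # exact integral of the product PDF over a cell
    return (Phi(x1) - Phi(x0)) * (Phi(y1) - Phi(y0))

# Admissible region for the event {x > 0 and y > x}, with x, y ~ iid N(0, 1).
# Keeping only grid cells entirely inside the region gives a certified
# lower bound; shrinking the cell width h tightens it toward 1/8.
h, m = 0.05, 80                  # cell width and grid extent (covers [0, 4])
lower_bound = sum(
    cell_mass(i * h, (i + 1) * h, j * h, (j + 1) * h)
    for i in range(m)
    for j in range(i + 1, m)     # j*h >= (i+1)*h guarantees y > x on the cell
)
print(lower_bound)               # approaches the true value 1/8 from below
```

The actual method of course handles far richer regions by letting the SMT solver find and carve out the hyperrectangles, but the certificate has this same shape: a sum of exactly-integrated boxes inside the admissible region.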
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Proving an algorithm’s fairness would in practice require access to the algorithm’s source and parameters. Perhaps future regulators would use black box MCMC methods to identify possible violations of fairness and then request the necessary details as part of an audit. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
Tue, 06 Sep 2016 01:00:00 +0000
/research/2016/09/05/symbolic-volume-integration.html
Tags: wisconsin, albarghouthi, dantoni, drews, microsoft, nori, fairness, smt, logic, integrate, research

Proving Algorithmic Fairness I - Introduction

<h1 id="overview">Overview</h1>
<p>This semester (Fall 2016) I’m beginning an independent study course with
<a href="http://pages.cs.wisc.edu/~aws/">Dr. Aws Albarghouthi</a> of UW-Madison’s
CS department. His research focuses on the “art and science of program
analysis.”</p>
<p>During this course, we plan to investigate an idea called “algorithmic
fairness.” It is interesting for both its technical content and
cultural relevance. Here’s an example illustrating algorithmic
fairness’s place in the big picture:</p>
<p><em>Suppose you’re hiring employees for your business, and you’ve decided
to use a fancy machine learning algorithm to sift through piles of application
documents and select candidates for interviews<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>. In hiring, it is ethically
correct (and legally necessary) to be unbiased with respect to protected
classes—e.g., race or religion. Can you guarantee that your algorithm is
unbiased?</em></p>
<p>In order to get me acquainted with this topic, Dr. Albarghouthi pointed me
to a paper he’s been collaborating on, titled “Proving Algorithmic
Fairness”<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>. There is already a body of work devoted to the topic of
algorithmic fairness<sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>, but this paper introduces the following innovations:</p>
<ul>
<li>
<p>It presents the notion of fairness <em>with respect to a <strong>population model</strong></em>.
A population model can be thought of as a “random person generator”, drawing
from a joint distribution over the space of demographic traits. In our hiring
example, it would describe the population of possible applicants. The
population model is a useful construct in that it provides a standard
by which to judge fairness. For example, if the population of possible
applicants has a certain ethnic composition, we can judge the fairness of
an algorithm by comparing the ethnic composition of its output with the
population’s ethnic composition. Previous literature on the subject
judges fairness with respect to particular datasets instead; the paper argues
that those datasets may themselves be biased. It is envisioned that
social scientists in government agencies or NGOs would prepare these models.</p>
</li>
<li>
<p>The paper introduces a new-fangled integration method for its computation
of probabilities. It’s described as a symbolic volume-computation algorithm
that uses an SMT solver. The paper’s new method is preferred over more
typical Markov Chain Monte Carlo integrators because it guarantees
a lower bound for the integral; this guarantee allows us to <em>prove</em>
fairness or unfairness, rather than express a mere statistical <em>confidence</em> in
fairness or unfairness.</p>
</li>
<li>
<p>The concepts of the paper are packaged into a fairness verification tool
called <em>FairSquare</em>, which is tested against a set of benchmark population
models and classifier algorithms.</p>
</li>
</ul>
<h1 id="proving-fairness">Proving Fairness</h1>
<p>Suppose we have a binary classifier algorithm \(\mathcal{P}\), whose input is
a person \(\vec{v}\) (a vector of that person’s traits) and whose output is
“true” or “false”, “hired” or “not hired”, etc. A proof of fairness for
algorithm \(\mathcal{P}\) consists of showing that the following inequality
holds:</p>
<script type="math/tex; mode=display">\frac{P[\mathcal{P}(\vec{v}) \; | \; v_s = true]}{P[\mathcal{P}(\vec{v}) \; | \; v_s = false]} > 1 - \epsilon</script>
<p>where \(v_s \) is an entry of \(\vec{v}\) indicating whether the person
belongs to a protected class—e.g., a particular religion or ethnicity—and
\(\epsilon < 1\) is some agreed-upon or mandated standard of fairness, with
smaller \(\epsilon\)s implying stricter fairness requirements.
In English: if we have two people who are equal in every way except their status
in a protected class, we must show that the algorithm is equally likely to
approve both people (within some threshold).</p>
<p>Note that it isn’t sufficient for the algorithm to
simply ignore protected class data; correlations between protected
class data and hiring criteria can lead to unbalanced outcomes even if
ethnicity or religion are simply “left out” of a hiring algorithm.</p>
<p>In proving this inequality, it is useful to eliminate the conditional
probabilities via the identity \(P[A | B] = \frac{P[A \land B]}{P[B]}\),
giving the inequality</p>
<script type="math/tex; mode=display">\frac{P[\mathcal{P}(\vec{v}) \; \land \; v_s = true] \cdot P[v_s = false]}{P[\mathcal{P}(\vec{v}) \; \land \; v_s = false] \cdot P[v_s = true]} > 1 - \epsilon</script>
<p>The probabilities of intersections are easier to directly compute than
conditional ones.</p>
<p>In order to prove the inequality, it suffices to find a lower bound
\(> 1 - \epsilon\) on the
LHS; hence it suffices to find lower bounds on the probabilities in the
numerator, and upper bounds on the probabilities in the denominator.
Similarly, in order to disprove the inequality (i.e. prove unfairness),
it suffices to find an upper bound on the numerator and a lower bound on
the denominator. This pursuit of upper and lower bounds lends itself to the
paper’s Symbolic Volume Integration scheme, which is proven to converge to
exact integrals in a monotonically increasing manner.</p>
<p>In upcoming posts, I will dig into the content of this paper in more detail.
Topics will include the paper’s Symbolic Volume Integration scheme and a
description of the <em>FairSquare</em> tool’s performance on some benchmarks.
\( \blacksquare\)</p>
<p> </p>
<p> </p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Using an algorithm is a good idea not only for the obvious speed considerations, but also for consistency; Daniel Kahneman’s research in behavioral psychology has shown that even simple classifier algorithms are more reliable for candidate selection (among other things) than human judgment. See <a href="https://www.amazon.com/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374275637/ref=sr_1_1?ie=UTF8&qid=1329063030&sr=8-1"><em>Thinking, Fast and Slow</em></a>, Part III, chapter 21: Intuitions vs. Formulas. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Albarghouthi, D’Antoni, Drews, Nori; in submission as of this writing (2016-09-01). <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>See these for example: <a href="https://arxiv.org/abs/1104.3913"><em>Fairness Through Awareness</em></a>; <a href="https://arxiv.org/abs/1412.3756"><em>Certifying and Removing Disparate Impact</em></a> <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
Fri, 02 Sep 2016 01:00:00 +0000
/research/2016/09/01/proving-algorithmic-fairness.html
Tags: wisconsin, albarghouthi, dantoni, drews, microsoft, nori, fairness, research