Errata & Additions

Fixes applied to V.99 to create V.991

Acknowledgments

Thanked Geoff Amor, copy editor for Cambridge University Press.

Preface

Introduction

Last sentence of introduction: Typo “hat” changed to “that”

Part I

Part II

Chapter 4, Section 4.3
Clarified ending of 3rd bullet of last itemization. Now ends, “or use the genre itself within the user interface.”

Chapter 4, Section 4.6
Slight revision of last paragraph to this: “We concluded this section’s examples with COVID-19 mortality prediction to show our humility in the face of very difficult problems. However, we do hold out hope that this data science application could significantly improve with more data and effort.”

Chapter 5, Section 5.5
Reference 138’s URL changed to www.datascienceincontext.com/seer to allow us to redirect readers to site updates.

Chapter 7
Added headings for the five ethical examples.

Part III

Chapter 10, Section 10.3

Changed “imaging what an attacker could do.” to “imagining what an attacker could do.”

Second-to-last sentence in the fourth paragraph from the end changed to: “A strong defense against stealing a model’s training data is not to have a single model, but rather to train and deploy an ensemble of models.”

Chapter 11, Section 11.3

“Equations such as” changed to “Theories expressed as equations such as”

Chapter 11, Section 11.4.4

At the end of Item 4, “Prosecutor’s fallacy,” we added, “To avoid falling for this fallacy, we again advise working through an example.”

At the end of Item 9, “Clustering illusion,” we added, “To avoid this, remember that things that may seem out of the ordinary may not actually be so. Regrettably, certain probabilities are sometimes counter-intuitive.”

At the end of Item 13, “Survivorship bias,” we added, “To avoid this, pay careful attention to studies that extend over a long period of time, and consider carefully what would happen if there was data on all the initial participants.”

Chapter 12, Section 12.1

At the end of the paragraph which begins, “Even when using a good proxy metric…”, added sentence, “The book, System Error, contains many examples of situations where there has been excessive focus on optimization using insufficient proxies.” followed by a citation to Sahami et al.

Chapter 12, Section 12.4.1

Sentence with the word “itself” is changed to, “A system’s ability to quantify impacts and tune itself and thus become ever more effective.”

Part IV

Chapter 16, Recommendation 3

Added this element to Table 16.2:

Trust: This term comes up frequently, but it has so many connotations (reliability, privacy, etc.) that discussions involving it are amorphous.

Chapter 17, Recommendation 6

The 2nd Bullet is now: “In the insurance realm, some US regions ban the use of certain types of data (e.g., zip codes or credit scores) in insurance pricing decisions. What is the legality of machine learning algorithms that do not use such data but behave in some ways as if they did?” The two other bullets were slightly reworded to increase parallelism.

Chapter 18, Recommendation 10

First sentence of the 2nd Bullet clarified: “Alternative business models where users purchase clear and vetted policies that govern their search, social network, or streaming application results, thereby obtaining recommendations that meet their long-term objectives.”

Concluding Thoughts

References

Several small changes, particularly to URLs.