
A dive into the last gun debate with Lott [LONG]

2019.04.16 09:36, posted by 4yolo8you

The conversation with Lott got very technical, and the follow-up will be even more so. It might be good and fun to summarize the disagreement and review some relevant texts, if only for myself.
Most of the in-text links are supplementary, leading to Wikipedia, source texts and a selection of graphs from papers.
general tl;dr: Don't expect that science is ready to confidently resolve all the gun policy questions, against or pro guns.

People

tl;dr: Lott is pro guns, Donohue is pro gun control.
Let's abbreviate the main rival hypotheses: MG,LC (More Guns, Less Crime), MG,MC (More Guns, More Crime), and also MG,IDK (guns make no net difference). The last one is tricky – remember that we need to distinguish "the effect is really zero" from "the data is inconclusive", even though gun advocates may try to muddy this up. We'll also use some standard gun deregulation lingo like RTC (Right-To-Carry) and SYG (Stand Your Ground), and a couple more acronyms introduced later.
Even though it was a solid ambush, Destiny did well. Steven's conversation partner, John Lott, has been the leading voice of MG,LC for decades now. He's relatively renowned, so it's pretty cool that he came on stream. You can all read about his NRA funding etc. on Wikipedia, I want to skip the ad homs and focus on substance.
The opposite view, MG,MC, has been argued by Donohue and some other researchers. Segments of the talk on stream centered around the last big study from Donohue's group, Right-to-Carry Laws and Violent Crime, which is unpublished but has been circulating and evolving in preprint form. In the 2018-11 version it's 126 pages long, of which 42 are the main text, fewer if you skip graphs and tables. It's actually a pretty quick and easy read.

Background: causal inference is hard

tl;dr: Causal inference can be really difficult and confusing, especially with observational data.
Both causal and non-causal correlations can appear and disappear if you make any of several possible mistakes in study design and analysis.
You want your causal variable to be conditionally independent of all other possible causes, but not of its own effects. You can achieve this with some source of randomization (by design or by accident), or with statistical controlling (implicit stratification). If you mess up, your correlation estimates will capture external conditional probabilities that are unrelated to the causal effect. Researchers have been developing a modern toolset to deal with this, in large part within economics, around the Rubin causal model. In this approach, you essentially strive to set up an equivalent of a randomized experiment, even in observational settings.
All such quasi-experimental methods rely on a leap of faith: faith that you managed to turn an observation into an intervention. To make it more credible, you typically conduct a series of robustness tests that show the results hold even when you shake things up a lot. Your goal should be Popperian falsification. The more severe the tests are, the more convincing your argument is.
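Since "statistical controlling" is doing a lot of work here, a tiny simulated sketch of what it buys you – all variables here are made up for illustration and have nothing to do with actual gun data:

```python
import numpy as np

# Toy illustration: a confounder Z drives both the "policy" X and the
# outcome Y. The true causal effect of X on Y is set to 2.0.
rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)                       # confounder
x = z + rng.normal(size=n)                   # treatment, correlated with Z
y = 2.0 * x + 3.0 * z + rng.normal(size=n)   # outcome

# Naive regression of Y on X alone picks up the backdoor path through Z.
naive = np.polyfit(x, y, 1)[0]

# Controlling for Z (multiple regression) recovers ~2.0.
X = np.column_stack([x, z, np.ones(n)])
adjusted = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(f"naive slope:    {naive:.2f}")     # ~3.5, biased upward
print(f"adjusted slope: {adjusted:.2f}")  # ~2.0, close to the truth
```

The same correlation "appears" or "disappears" purely depending on what you condition on, which is the whole game in observational causal inference.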
About guns specifically: this research area is young and a mess. Effects of gun policies are complex; they work on many margins, in many different contexts and directions, which evidently makes them very challenging to isolate. This applies to Lott's pro-gun research as well. Even randomized experiments should be taken with a grain of salt, quasi-experiments need a solid tablespoon or two, and raw (cross-sectional) comparisons an entire shaker. The NAS report from 2005 concluded that macro-level studies of the time were hopeless, and recommended that researchers focus more on small studies of specific events and locales, instead of trying to answer the big questions in one bite. I don't know how much has improved since.

Background: p-values

tl;dr: Some basic statistical ideas are counterintuitive and widely abused.
A basic correct understanding of p-values is also very useful here. Non-significant estimates aren't evidence for the non-existence of effects. They're INCONCLUSIVE. Having non-significant variables in the model can still improve its predictions. p-values alone tell you nothing about the importance or size of a variable's statistical or causal relation to others, nor about the probability of either hypothesis being true. And statistical power (primarily, sample size) isn't important because it helps your estimates become significant, but because it determines their precision. A statistically significant result can be wildly imprecise and unreliable under low power. In gun research, the sample size is often just a sad handful of states.
The MG,IDK crowd likes to interpret non-significant results as support for a hard "the net effect of guns on crime is actually zero, so why control guns", but on their own such results actually mean "this estimate is inconclusive". To answer their question, you'd have to choose a smallest effect size of interest and specifically test against it with adequate power.
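A minimal simulation of the low-power point (invented numbers and a crude z-test, nothing from the actual papers):

```python
import numpy as np

# A real effect exists (standardized difference 0.3), but with only
# n = 20 per group most studies come out "non-significant" -- that
# makes them inconclusive, not evidence of a zero effect.
rng = np.random.default_rng(1)
d_true, n, sims = 0.3, 20, 5_000
sig, sig_effects = 0, []
for _ in range(sims):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(d_true, 1.0, n)
    diff = b.mean() - a.mean()
    se = np.sqrt(a.var(ddof=1) / n + b.var(ddof=1) / n)
    if abs(diff / se) > 1.96:   # crude z-test at alpha = 0.05
        sig += 1
        sig_effects.append(diff)

power = sig / sims
print(f"power: {power:.2f}")  # roughly 0.15
print(f"mean 'significant' estimate: {np.mean(sig_effects):.2f}")
# Note: the estimates that do clear significance under low power also
# systematically exaggerate the true 0.3 effect.
```

So a "handful of states" design both misses real effects most of the time and inflates the ones it catches.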

Donohue's argument

tl;dr: Donohue provides results in support of MG,MC, of low-ish quality (the data is crap), but still a bit better than other studies.
The Donohue, Aneja and Weber (DAW) preprint argues for MG,MC using two Rubin toolset methods: difference-in-differences (DiD) and synthetic control (SCM), introduced respectively in the 1990s and 2003. In both you use standard statistics (typically, linear model based regressions, t-tests, F-tests etc.). What's new is how you pick, or construct, control groups to compare with the treatment/experimental ones. In DiD you use observational units that you think are similar enough (duh), and in SCM you solve a minimization problem to fit a combination of observational units that is most similar to the experimental units.
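The DiD estimator itself is just a double subtraction; a toy sketch with invented state-level numbers (not real crime data):

```python
import numpy as np

# Rows: [pre, post] mean outcome for each group.
treated = np.array([50.0, 46.0])   # e.g. crime rate in RTC-adopting states
control = np.array([48.0, 40.0])   # comparison states, same periods

# Each group's raw change over time...
d_treated = treated[1] - treated[0]    # -4.0
d_control = control[1] - control[0]    # -8.0

# ...and the DiD estimate: treated change minus control change. Under
# the parallel-trends assumption this nets out the shared time trend.
did = d_treated - d_control
print(did)  # 4.0 -> crime fell less in treated states: a relative increase
```

Everything rides on parallel trends: if the groups would have diverged anyway, the double subtraction attributes that divergence to the policy.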
DAW actually appear to do quite a bit of theoretical and quantitative legwork, and subject the results to what seems to be many robustness tests. They use both their own variable set (aka the DAW model), and a selection that Lott and Mustard used in the past (aka the LM model). The models differ in controlling for police employment, incarceration levels, and the granularity of demographic controls. Personally, I'm not sure which one is more appropriate, and would like to hear Lott's perspective on that. Depending on your priors, the general results are compatible with either MG,MC or MG,IDK. The data is uncomfortably aggregate and noisy, and some analysis decisions may be seen as questionable, but AFAIK altogether it looks like a more severe test than anything Lott has offered for MG,LC, before or since.
Mechanisms that could make RTC laws increase crime
DAW propose a bunch of mechanisms for the proposed effects, and their heterogeneity, focusing on the following ones for the MG,MC:
We should note that many of them don't necessarily require that more gun permits are issued, just that they are easier to get. A large part of Lott's and others' counterarguments seems to be that in many cases post-RTC there was no increase in the absolute number of new permits, and that very few legal owners are arrested for gun crimes.
DiD results
The paper begins with simple DiD estimates that replicate and extend some earlier works. Here, DAW show either MG,MC (but only for violent crime), or MG,IDK (other crimes). When LM is used, we instead get MG,MC (but only for other crimes) and MG,IDK (violent crimes). LM is argued to be less robust because it includes many more noisy variables, which blur the estimates. The broad pattern of data at the considered aggregation looks like this, so I guess it's hard not to detect some effect.
SCM results
33 separate events (natural experiments) are studied, with appropriate quasi-controls for each. For example, Texas is compared to a synthetic Texas made out of 57.7% California, 32.6% Wisconsin and 9.7% Nebraska. In that estimated, imaginary (counterfactual) Texas without RTC laws, the crime drop would have been larger than what happened in reality (31% vs 19.7% decline). Now, this is a shaky way to estimate causal effects, but in this case we don't have anything better. You can find it more or less convincing, but the general picture is that RTC laws corresponded to a relative increase in crime.
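To make the "synthetic Texas" idea concrete, here's a stylized sketch with three invented donor states and a brute-force search over the weight simplex (real SCM implementations solve this as a constrained quadratic program, and match on covariates too):

```python
import numpy as np

# Invented numbers: 3 donor states, 4 pre-treatment periods. We look for
# the convex combination of donors that best tracks the treated unit
# before the law change.
treated_pre = np.array([10.0, 11.0, 12.0, 13.0])
donors_pre = np.array([
    [9.0, 10.0, 11.0, 12.0],    # donor A
    [12.0, 13.0, 14.0, 15.0],   # donor B
    [8.0, 8.5, 9.0, 9.5],       # donor C
])

best_w, best_err = None, np.inf
grid = np.linspace(0, 1, 101)
for w1 in grid:                  # brute force over the 2-simplex;
    for w2 in grid:              # weights are >= 0 and sum to 1
        if w1 + w2 > 1:
            continue
        w = np.array([w1, w2, 1 - w1 - w2])
        err = np.sum((w @ donors_pre - treated_pre) ** 2)
        if err < best_err:
            best_w, best_err = w, err

print("weights:", np.round(best_w, 2))  # ~[0.67, 0.33, 0.0]
# The fitted weights then project a counterfactual post-treatment path;
# the treatment effect is the gap between it and the real series.
```

The "shaky" part is exactly this extrapolation: a good pre-treatment fit doesn't guarantee the weighted combination keeps tracking the counterfactual afterwards.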
The actual aggregate picture is almost a textbook SCM result, with zero difference between treatment and control groups before the treatment, and a departure of trends after. One troubling thing is the large estimate uncertainty. DAW showcase two particularly well-fitted models, Texas and Pennsylvania. However, many other states look much worse (see appendices). The graph of central tendencies looks like this.
Robustness tests
To DAW's credit, they conduct many tests of robustness to random permutations of variables and comparison units. Lott repeated that many of the results are driven by Hawaii; in fact, they do hold even when Hawaii, NY and CA are excluded.
To sum up, the study goes far with very limited data and suggests that RTC laws increase crime, maybe, or at least don't conclusively change it, while a decrease is pretty implausible. The main thing I take away from this is that the MG,LC hypothesis is weaker than the alternatives under decently severe inspection.

Lott's and other counterarguments

tl;dr: Lott and others have some more and less convincing criticisms of DAW; what he really should do instead is conduct his own, better study.
Lott still commits to and defends MG,LC, arguing against DAW on several fronts (e.g. here, here and here).
In all three replies mentioned above, Lott's main research backing seems to be a single review paper, which is unfortunately not empirical. Unless he demonstrates his own testing that improves on DAW's, I'm not sold on his arguments. The review paper criticizes, without going into satisfying levels of detail, an earlier DAW paper and a paper by Zimmerman et al. that argued for MG,IDK. I find it funny that Lott is so committed to MG,LC that he attacks even the MG,IDK results. By the way, Zimmerman's group uses stepwise regression (dropping "non-significant" variables), which is garbage, leads to overfitting, and should never be used.
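To see why stepwise selection is garbage, a small simulation (pure noise, no connection to Zimmerman's actual data or variables):

```python
import numpy as np

# Regress pure noise on 200 pure-noise predictors, keep only those with
# |t| > 2, and the survivors look "significant" by construction --
# that's overfitting to noise, not signal.
rng = np.random.default_rng(2)
n, k = 100, 200
X = rng.normal(size=(n, k))
y = rng.normal(size=n)        # outcome unrelated to every predictor

kept = []
for j in range(k):
    coeffs = np.polyfit(X[:, j], y, 1)
    resid = y - np.polyval(coeffs, X[:, j])
    se = np.sqrt(resid.var(ddof=2) / (n * X[:, j].var()))
    if abs(coeffs[0] / se) > 2.0:   # crude "keep if significant" step
        kept.append(j)

print(f"{len(kept)} of {k} noise predictors survive selection")
# Around 5% survive (alpha ~ 0.05) even though the true model is empty;
# a model refit on only the survivors would report them all as real.
```

Selection and inference on the same data invalidates the reported p-values, which is the core of the objection.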
There's also Kleck's comment on DAW. Gary Kleck is, like Lott, a controversial figure, but unlike him a vocal proponent of hard MG,IDK. He formulates the same criticisms as above, in more detail. He also expands on another valid point about the unacceptably large variance in effect timing – it ranges from one to ten years after an RTC reform. Some of his arguments show what I think are misunderstandings of p-values, some share interesting knowledge about bad county-level data; it's all similarly unsatisfying though, because it doesn't provide solid results, simply punching us back to agnosticism.
Finally, I'll note that there's a tangentially related back-and-forth that Donohue took part in (newer and older paper), plus a 4-page piece in Science, that give some more technical background on the research area, his studies, and responses to criticisms. He expands, e.g., on the scale of gun thefts as an important causal mechanism.

Some other recent quasi-experimental studies

tl;dr: Other studies are similarly underpowered and rarely detect effects. Results compatible with Lott's MG,LC appear to be the rarest of all, though.
Mostly for illustration, I've found a couple of other recent papers that tried to do SCMs of gun control laws similar to DAW.
There's a 2016 analysis of the effects of SYG laws, covering 22 years and using SCM. It found mixed effects, with very limited evidence of increases in gun deaths, murders or manslaughter, and only in AL, FL and MI (out of 14 states). The increase is further undermined in gun lovers' eyes by the fact that these specific SYG reforms had a duty-to-retreat clause. Nowhere a decrease, though.
There's also a 2018 paper using SCM, about RTC effects on murder; out of 8 states an increase was detected only in New Mexico. No decrease has been found. It has unimpressive robustness tests AFAIS.
Kleck and his team did another type of quasi-experimental paper themselves, of the instrumental variables type. In that approach you use some external, hopefully unconfounded variable to isolate exogenous variation in the treatment. Here they try to use Republican poll results and Vietnam veteran population as instrumental variables that capture second amendment fanboyism mostly independently from crime. I'm not sure how much I buy their approach, but the model provides general support for their MG,IDK, with a couple of exceptions:
[Requiring] a license to possess a gun and bans on purchases of guns by alcoholics appear to reduce rates of both homicide and robbery. Weaker evidence suggests that bans on gun purchases by criminals and on possession by mentally ill persons may reduce assault rates, and that bans on gun purchase by criminals may also reduce robbery rates.
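For intuition on the IV machinery itself, a generic simulated sketch (not Kleck's model or variables – everything here is invented):

```python
import numpy as np

# U confounds treatment X and outcome Y; the instrument Z shifts X but
# is independent of U, so two-stage least squares (2SLS) recovers the
# true effect where plain OLS cannot.
rng = np.random.default_rng(3)
n = 200_000
u = rng.normal(size=n)                        # unobserved confounder
z = rng.normal(size=n)                        # instrument
x = z + u + rng.normal(size=n)                # endogenous treatment
y = 1.5 * x + 2.0 * u + rng.normal(size=n)    # true effect of x is 1.5

ols = np.polyfit(x, y, 1)[0]   # biased by the x-u correlation

# 2SLS: project x on z, then regress y on the fitted values
# (equivalently, the Wald ratio cov(z, y) / cov(z, x)).
x_hat = np.polyval(np.polyfit(z, x, 1), z)
tsls = np.polyfit(x_hat, y, 1)[0]

print(f"OLS:  {ols:.2f}")    # ~2.17, biased
print(f"2SLS: {tsls:.2f}")   # ~1.50, close to the truth
```

The catch is the exclusion restriction: if the instrument affects crime through any channel other than the treatment (easy to imagine for "Republican poll results"), the whole identification collapses.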
Kleck's crowd also did a meta-review and a couple more papers where they argued that nearly no macro study was done correctly, and that MG,IDK is the most solid hypothesis. What I'm suspicious about in their approach is that – if I read correctly – they are very keen to control for gun ownership, which is an obvious mediator for gun control laws. The mediated effect is controlled away even though it may represent the main causal channel. They put forward an argument about this, but I don't buy it. At the very least, I'd like to see results from a model without that control.

Santaella-Tenorio, Cerdá, Villaveces & Galea review

tl;dr: Some forms of gun control may have empirical backing.
There's a nice and pretty comprehensive meta-review from 2016 covering 130 empirical analyses from around the world, representative of the field, including 62 longitudinal studies. Unfortunately, they didn't use meta-analytic tools that help describe potential publication bias, e.g. a funnel plot, and only evaluated study quality in broad terms.
The results are summarized in graphs two, three, and four. They aren't very satisfying either. Too many estimates border suspiciously on non-significance. Effects of some law types spread around what appears to be a mean of zero, as numerous on the MG,MC side as on the MG,LC side. And the impressively large effects are usually even more suspect than the almost-non-significant ones. Still, there is at least some support in the data for some forms of gun control.

Conclusion

My takeaway: at this point not much can be "debunked" or "proven" either way. Lott's MG,LC doesn't look good at all, but MG,MC isn't dandy either. We might as well talk about philosophy and ethics of gun laws and research instead. From the empirics, it's worth noting that at least some forms of gun control have some support in data.
With so many guns in circulation, margins at work, and easy spillovers across state borders, the isolated effect sizes of local reforms might be inestimable at the resolution that currently available tools and data provide.
submitted by 4yolo8you to Destiny

