The perils of statistics in ufology

The following is taken from chapter 20 of Allan Hendry's 1979 book "The UFO Handbook" (Doubleday; ISBN-10: 0385143486; ISBN-13: 978-0385143486). Despite the book being over 40 years old, many of the issues and pitfalls detailed by Hendry are still incredibly relevant today and are routinely ignored or overlooked in ufology. This book chapter is a tough slog, but highly rewarding for anyone who is interested in statistics. I have tried to reproduce the style and formatting of the original text as closely as possible.

(You can download The UFO Handbook in its entirety here. For a cleaner scan, here is a substantially larger PDF file)

Tools: Statistics

It should come as a surprise to no one by now that the answer to the UFO phenomenon has not proffered itself on a silver platter. The problems are legion, not the least of which are the uncertainty of the value of each individual report and confusion over the nature of the reports collectively. UFO reports frequently succeed at being "provocative" but always fail at cohesively leading us anywhere. To cope with this frustration, two procedures have always suggested themselves:

1) Wait for the really "big" case. The sighting which (unlike all its predecessors) combines impeccable credentials with a self-revealing nature (e.g. the spaceship that lands on the White House lawn).

2) Examine the past for patterns. Collect all of the past sightings and scrutinize them as a body of data with statistical techniques.

The former "solution" assumes, first of all, that one UFO event is capable of representing all UFO events. What this approach really expresses is dissatisfaction with the myriad UFO reports we already possess as being of no probative value. Unfortunately, after thirty years (or centuries as some would have it) of precedent, and tens of thousands of sightings, no UFO case exists that isn't elusive in both quality and meaning.* Furthermore, there is no reason to believe that it will happen in the future. Thus, UFO researchers have commonly opted for the latter approach, with the understanding that any patterns in common to all (or many) UFO reports might well be as puzzling as the individual reports themselves. How well have these exercises succeeded at shedding a little more light on the nature of the UFO phenomenon?

There are many variables that have to be thoughtfully negotiated if any statistical effort is to be meaningful. Working against this caution is the universal temptation to extract statistical meaning out of a given body of data whose value far exceeds the "sum of the parts." Three such variables that seem to recur as problems in UFO statistics include:

1) Random sampling—roughly, the "fairness" of the collection mechanism involved.

2) Validity of the individual entries—whether discretion has been exercised in the selection of cases, with IFOs weeded out. As an alternative, has a weighting system been adopted that favors the more confidence-inspiring cases? Is it effective?

3) Uniformity of the data being compared—is an assumption being made that NLs are the same thing as CE II's? That is, if apples and oranges are being summed, is the statistical conclusion valid for both?

Each of these considerations will play a role as we take a look at some of the more prominent attempts at compiling UFO statistics; first, however, let's give some thought to the ways UFO reports are collected, as these collections are the very heart of any statistical effort.

*If UFOs had a clear-cut explanation, you wouldn't have to turn to the private UFO sector to find out. Sufficient UFO reports in the public domain would let the whole world know.

Report Collection—Fair or Biased?

My own collection of 1,307 cases was confined to the United States so that I could contact the witnesses by phone; however, collection was truly nationwide in scope. A toll-free number distributed to police departments, airport towers, and planetaria in all the contiguous forty-eight states ensured that. The distribution of all reports from August 15, 1976, to November 31, 1977, for IFOs and UFOs is shown below. Does this represent all of the UFO reports for that period in the United States? By no means; it doesn't even scratch the surface. Newsclipping collections for the same period of time reveal a whole body of UFO sightings that seldom repeated the ones I received. Other UFO organizations diverted away many more reports; most important of all, most people don't try to report their sightings. Only 13 per cent do, according to a survey taken for the Condon Report in the sixties by the Opinion Research Corporation.

A more reasonable request would be, is mine a truly random sampling? Do states that turn in the most sighting reports truly have the most "UFOs" in proportion to the states with few total reports in this study? Sad to say, not even this inference can be meaningfully drawn. As mentioned before, the Center for UFO Studies is not the only civilian study group actively campaigning to get UFO reports. A group in the state of Washington, for example, is also distributing its phone number throughout the western states; our total for Washington, then, would have been far larger in its absence. Similarly, about sixteen other groups in the United States are also vying for UFO reports to be directed to their attention. The result is a skewed geographical distribution for report collection.

A given UFO reporting center becomes well publicized in its own home state. The artificially big turnout in the metropolitan Chicago area in my collection reflects this. In the map below, the total number of reports for March 1, 1977, through July 31, 1977, is shown by state. Note the high number for Illinois vs. the low number for Arizona. I've prepared this for comparison with a similar map published by Ground Saucer Watch, Inc., headquartered in Phoenix, Arizona, showing their national input for the same period of time. Note the large number of Arizona-based reports now† relative to the rest of the country and the negligible number of reports from Illinois. Thus, the individual UFO groups are not getting a portrayal of U.S. "hot areas" and "cold areas" as a function of uniformly distributed, random sampling.

†And they chose not to include the number of reports they received based on the Phoenix ad plane!

Is it possible to get an evenly balanced sample, then, by pooling all the known UFO cases from all collection mechanisms? The most ambitious attempt of this kind in the United States is the Center for UFO Studies' computerized UFO catalogue UFOCAT, containing sixty thousand individual UFO entries from all over the globe. This project was begun by Dr. David Saunders in 1967, using three thousand cases submitted by Jacques Vallee to the Condon Committee. Maintenance of the collection was assumed by the Center for UFO Studies in 1975, which has developed the catalogue to its present status.

The sixty thousand cases, stored as entries of data on computer tape, appear in readable print-out form as lines of alphanumeric characters which represent quantized information about the sightings. The categories for which information is stored include: file numbers, the source of the report, date, time, location, state and county (or country), the numbers, ages, sexes, and names of the witnesses, the type and special features of the report, the number of objects seen, duration, size (estimated or angular), and latitude/longitude.

One of the first problems with UFOCAT as a statistical tool becomes apparent here already. An important feature like "duration," for example, is only represented in about 12 per cent of the cases.

Furthermore, other telling considerations like terrain, weather, witness location, information content of the report, credibility, strangeness, colors, lights, shapes, structures, motions, formations, evidences, viewing conditions, witness data, and such are either spottily encoded or do not appear at all; so, depending on what you are hoping to glean from the computerized file, you may not have such an impressively large sample after all.

What about the sources of all these sixty thousand cases? Where did they all come from? Currently, the reports refer to over 250 specific sources including books, investigators, and magazines. A review of UFOCAT conducted in May 1977 showed the following breakdown:

Investigative files 31%

Clipping files 9%

UFO periodicals 19%

"Authors'" files 10%

Books 15%

Report catalogues 15%

Some of these sources are less than inspiring as arbiters of complete and accurate information on UFO events. "UFO periodicals" range from:

the publications of civilian UFO groups whose accounts of true UFO allegations suffer to varying degrees from missing negative information, superficial follow-up, and the obvious desire to fit sightings into extraordinary explanation schemes, to—
misleading news media assumptions (see the Wakefield, New Hampshire, incident in Chapter 17: "Tools: Press"), to—
complete fabrications of "UFO events" that never took place, published by newsstand UFO magazines with circulations of 100,000 (see Chapter 21: "Tools: UFOlogists and UFO Groups")!

Books are virtually always written from the polarized standpoints of UFO proponents and skeptics, and the same cases can be used as evidence by both camps. Report catalogues by their very nature severely curtail the amount of detail of the cases they contain, and disguise the distinction between cases that are mere anecdotes and cases that have been exhaustively studied. Clipping files of newspaper articles? See the Tools: Press chapter for the number of newspaper "UFOs" based on reports that turned out to be IFOs in my own study. Clearly, then, the quality of the entries in UFOCAT fluctuates greatly.

Yet what about that 31 per cent "investigative files" category? That sounds like the best possible source. Why not simply write a program that eliminates every source in UFOCAT but that one to obtain more meaningful statistics? Prepare for a shock: 12,969 of the 18,299 UFO reports in that category are the Air Force's Project Blue Book cases. Dr. Hynek had the Center for UFO Studies workers re-examine all of the 13,000 cases collected by the Air Force throughout the summer of 1976. The Center wound up confirming what the Air Force had already established: only about 5 per cent of the cases in Project Blue Book are worthy of being considered genuine UFOs. That means that 95 per cent of the Air Force project's files are IFOs, yet they are all in UFOCAT. Indeed, the Blue Book reports make up 71 per cent of the important "investigative files" category, and 22 per cent of all the UFOCAT entries. So much for the value of UFOCAT's sources.

To be sure, the UFOCAT coders know full well that UFOCAT contains IFOs; many of them are identified as such (according to the original source's judgment, not the coders') in a column reserved for "explanations." Still, I was once reassured that in a mixture of IFO and UFO cases, the UFO "patterns" would clearly "drop out" of the IFO "noise." This would be true if it weren't for the fact that IFO cases do not exhibit random, chaotic characteristics relative to the UFO reports; indeed, most of the IFOs are caused by only four kinds of stimuli: stars, ad planes, aircraft, and meteors, which possess patterns of shape, duration, time of appearance, etc. at least as clear as those of the random variety of UFO appearances and behaviors. Furthermore, when IFOs exceed UFOs by a ratio of nine to one, the IFOs present in any UFOCAT-based study will swamp out any UFO "patterns" with "noise" or "patterns" of their own. All of this assumes, of course, that the observable characteristics of IFOs and UFOs are even being perceived, reported, and published accurately—a big assumption, judging by the conclusion of the IFO Message chapter.
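The shares quoted above can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative, and the sixty-thousand total is the rounded figure given earlier, so the results are approximate.

```python
# Figures quoted in the text above.
blue_book_cases = 12_969
investigative_files = 18_299
ufocat_total = 60_000          # "sixty thousand" entries, a rounded figure
blue_book_ifo_fraction = 0.95  # share of Blue Book cases judged to be IFOs

share_of_investigative = blue_book_cases / investigative_files
share_of_ufocat = blue_book_cases / ufocat_total
blue_book_ifos = round(blue_book_cases * blue_book_ifo_fraction)

print(f"Blue Book share of investigative files: {share_of_investigative:.0%}")
print(f"Blue Book share of all UFOCAT entries:  {share_of_ufocat:.0%}")
print(f"Approximate Blue Book IFOs in UFOCAT:   {blue_book_ifos}")
```

On these figures, roughly one UFOCAT entry in five is a Blue Book case that was judged to be an IFO, before any other source is considered.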

What about the homogeneity of the sampling in UFOCAT? Are all areas being proportionately represented (if not totally)? Certainly not for the world at large, since North America commanded 59 per cent of the reports in UFOCAT in May 1977 compared to 21 per cent for Europe, 7 per cent for South America, and only 3 per cent and 1 per cent for all of Asia and Africa respectively. This doesn't mean that more UFO events took place in the United States and Canada than in other places in the world; it just reflects the ease of UFO researchers in America in getting American reports.

What about the randomness of sampling in the United States, then? Aside from the Center for UFO Studies in Illinois, and the UFO Reporting Center in Washington, which also distributes a phone number throughout the western states, the other UFO groups get their input primarily from field investigators. The number of reports monitored in a given area under this system, to a large extent, is up to the personal zeal of the various investigators. Furthermore, the number of investigative file reports (beyond the Blue Book sightings) is overwhelmed by the influence of reports from the journals of these groups; these only portray the fraction of the total input received by the civilian groups and they are not necessarily from the different groups' home states. One last consideration, based on this graph released by the Center for UFO Studies of the increase in high-strangeness cases in recent years. The average share of UFOCAT for all years by high-strangeness cases is about 14 per cent. Yet this graph shows high-strangeness shares from 30 to 60 per cent, beyond the "Blue Book years" ending in 1969. Obviously, half of the cases in 1975 and 1976 were not Close Encounters. The explanation is that UFOCAT coders have placed an emphasis on the "more interesting" Close Encounters in the latter years. Such a change in admission policy in midstream obviously violates the homogeneity required for valid conclusions about, say, total reports per year and the proportion of NLs to CE III's.

Whither UFOCAT?

As a bibliography of raw UFO reports, UFOCAT is without peer. Specialized catalogues have been extracted with ease from this system for individuals interested in certain kinds of UFO reports, including:

—reports from 1973 only

—humanoid reports only

—Ohio reports only

—CE II EM cases only

—abduction reports only

—sightings from aircraft only

and more. The possibilities are limited only by the number of codes that exist for types, subtypes, and characteristics. From here, however, the printed output must be treated as only a reference guide to the original sources for the details. Otherwise, the distinction between poorly investigated reports and exhaustively studied sightings is completely lost.

UFOCAT cannot generally be used as a statistical tool, then, since it violates the three precepts set out in the beginning of this chapter. That hasn't stopped people from trying it, though, even researchers connected with the Center for UFO Studies. With the Center's co-operation, Edmund Scientific Company has offered the public a slide set of statistics based on total UFOCAT output. The graphs depict a variety of distributions, including the number of UFO cases vs. time, total UFOCAT cases by year, total Air Force cases by year, Air Force "unknowns" by year, low- and high-strangeness cases by year, distribution of reports by month, and Close Encounter cases by hour of the day. Other graphs show case durations, geographical distributions (CE III's by state), and witness statistics.

The immediate problem that commends itself is that the distributions are not exclusively for genuine UFOs. The total reserves of UFOCAT were employed to create the graphs; even that 95 per cent of the thirteen thousand Air Force cases which are only IFOs were called "UFOs of low strangeness!" The slide set's narrative states: "If we limit our print-out to what we call the high-strangeness cases, the Close Encounters, we greatly reduce the probable number of cases that can be ascribed to [IFOs] . . . not eliminate them, just reduce them." In the rigorously controlled study in this book, where I dealt with and evaluated all of the 1,307 cases, there were 71 candidates for Close Encounter status. Only 16, or 23 per cent, were deemed worthy of the label "UFO"; the remaining 77 per cent were matched to IFO sources. Thus, I cannot agree that the high-strangeness nature of the initial details of a UFO allegation is sufficient to guarantee a majority of genuine "unknowns." What does it mean to the statistical results when so many IFOs and dubious cases are mixed in with the worthy specimens? We'll include some of the Center/UFOCAT statistics in the context of the statistical claims that follow.
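To put the Close Encounter figures in context, the following sketch compares the UFO rate among the 71 Close Encounter candidates with the rate across the whole study. It uses only counts quoted in this chapter (the 113 UFOs figure appears elsewhere in the text); the variable names are illustrative.

```python
total_cases = 1_307     # all cases in the study
ufos_overall = 113      # cases judged to be UFOs across the whole study
ce_candidates = 71      # candidates for Close Encounter status
ce_ufos = 16            # Close Encounter candidates judged to be UFOs

overall_ufo_rate = ufos_overall / total_cases
ce_ufo_rate = ce_ufos / ce_candidates
ce_ifo_rate = 1 - ce_ufo_rate

print(f"UFO rate, all cases:            {overall_ufo_rate:.1%}")
print(f"UFO rate, Close Encounter only: {ce_ufo_rate:.1%}")
print(f"IFO rate, Close Encounter only: {ce_ifo_rate:.1%}")
```

Restricting the sample to high-strangeness cases raises the UFO share, as the slide set's narrative suggests, but on these counts IFOs still make up more than three quarters of the filtered sample.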

Duration—UFOs vs. IFOs

The duration of a UFO event has always been a singularly powerful parameter in UFO/IFO judgments . . . but only at the extremes. For example, in this controlled study, of the 121 UFO and IFO reports submitted with total viewing times from zero to ten seconds, 70 per cent were caused by meteors (including re-entries).

Of the 164 UFO and IFO reports greater than one hour in duration, 78 per cent were due to astronomical targets (stars, planets, the moon, and moondogs). There have been attempts, however, to extend the usefulness of duration to demonstrate a difference in the over-all characters of UFO and IFO reports. One such attempt was privately printed in May 1974, by Dr. Claude Poher, an astrophysicist in France's National Center for Space Studies. The graph below is adopted from one prepared by Dr. Poher to show the difference in character of a body of 508 cases of pre-screened UFOs (the hill-shaped curve) and 350 cases of pre-screened IFOs (the valley-shaped curve). The inference is that UFOs are not "misperceived IFOs," but something different yet coherent in character.

First, I present a similar graph of the reported durations of the UFOs and IFOs in my own study. It fails to confirm the distinctive difference in character established by Poher's graph for UFOs and IFOs as bodies of data. The UFO plot bears resemblance to the former graph, but the IFO graph is simply not a study in contrast in my findings the way it is in Poher's. The only feature of note in my own plot is that the UFOs are emphasized below a ten-minute threshold, whereas above it, the IFOs are emphasized. Another attempt at comparing the selection of UFOs with IFOs for duration was conducted by the Battelle Memorial Institute utilizing the first 4.5 years of the Air Force data collection. This is shown below:

Once again, a failure by an outside group at replicating a diametrically opposed nature for the UFO/IFO duration profile. In all three situations, UFO durations peaked in the same vicinity (five minutes or so) and sloped off in either direction. The IFO profiles behaved differently each time. This variation from data sample to data sample, despite their ample sizes, is not uncommon throughout the UFO literature. I submit this is caused (at least partly) by the excessively broad diversity of UFO (and IFO) stimuli. When stars, planes, balloons, meteors, flares, birds, and what-have-you are all jammed under the same statistical roof, the outcome is bound to be vulnerable as different samples are collected at different times in different places.
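The point about mixed stimuli can be illustrated with a toy calculation. In the sketch below, every number is hypothetical: each stimulus is assigned an assumed probability of producing a report longer than ten minutes, and two samples differ only in the proportions of stimuli they happen to contain.

```python
# Hypothetical probability that a report of each stimulus exceeds ten minutes.
P_LONG = {"star": 0.90, "aircraft": 0.20, "meteor": 0.00, "balloon": 0.60}

def pooled_long_fraction(mix):
    """Fraction of pooled reports over ten minutes for a given stimulus mix.

    `mix` maps a stimulus name to its share of the sample; shares sum to 1.
    """
    assert abs(sum(mix.values()) - 1.0) < 1e-9
    return sum(share * P_LONG[name] for name, share in mix.items())

# Two hypothetical collections with different stimulus proportions.
sample_a = {"star": 0.50, "aircraft": 0.30, "meteor": 0.10, "balloon": 0.10}
sample_b = {"star": 0.20, "aircraft": 0.40, "meteor": 0.35, "balloon": 0.05}

print(f"Sample A, reports over ten minutes: {pooled_long_fraction(sample_a):.0%}")
print(f"Sample B, reports over ten minutes: {pooled_long_fraction(sample_b):.0%}")
```

Although each stimulus behaves identically in both samples, the pooled duration profile shifts with the mix, which is one way the IFO curves could differ from collection to collection.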

Second, and more importantly, I regard it as a mistake to expect "duration" to stand on its own feet. Consider all the conditions that affect duration that have to be checked out on a case-by-case basis:

—Did the witness start watching the object from the "start"?

—Did the witness leave the event without watching it to its completion? (This is disappointingly common.)

—Did the witness have a wide-open view of the sky (horizon to horizon)? Was it partially restricted by trees, buildings, or clouds? Was his view severely restricted by, say, looking out a window?

—Was the witness stationary, or did he try chasing the object in a car? (Not uncommon.)

Clearly, with the exception of extremely short and long durations, the amount of time an aerial phenomenon is viewed is more a characteristic of the witness than the "UFO" candidate. With all of these variables artificially affecting the durations, even in the IFOs, it is meaningless to sum all unqualified duration figures under one statistical umbrella. I have ad plane durations that range from five seconds to one hour, stars seen from five minutes to seven hours, balloons from ten minutes to four hours, and all intervals in between, due to changes in viewing condition—even watching only one kind of stimulus. Members of a Canadian UFO group in Ontario fixed themselves in one place at Lake Ontario and watched planes only from the moment they became visible to the point of their disappearance. Fifteen separate durations were noted that ranged from fifteen seconds to eight minutes, seventeen seconds.

Conclusions? "Duration" is a powerful feature of identity when it refers to extremely short and long events, but is otherwise mostly a reflection of the witness's behavior during the event, coupled with the fluctuating behavior of the objects watched.

UFOs vs. Time—Hours of the Day Distribution

There is one statistical conclusion on which all researchers seem to have reached a consensus: the distribution of UFO reports by time of day. This graph of Close Encounter reports was published by Drs. Claude Poher and Jacques Vallee.

Notice the nocturnal emphasis of the reports; the primary peak occurs around 9 P.M. local time, with a secondary peak around 3 A.M. Good agreement between this graph and other similar attempts has occurred.

Compare it with the graph of all UFOCAT entries, of which a small minority are Close Encounters. Indeed, even my own collection of 113 UFOs yields the same profile. All of this has led UFO researchers to speculate on the nature of UFOs such that they would give rise to these characteristics. Poher and Vallee, for example, offered this graph for comparison; it shows the percentage of the working population of several countries out of their homes, and in a position to be a UFO witness.

Note how UFO reports peak during the hours when most people are indoors, even asleep. Poher and Vallee then artificially compensate for the lack of witnesses by adjusting each hour for the loss of potential witnesses.

The result is shown in the graph above.

The 3 A.M. peak of Close Encounters has been boosted into a single major peak, the implication being that if everyone were outdoors at all hours, we would have fourteen times the number of Close Encounter reports (or any UFO type, apparently) than we possess now. The Poher/Vallee file of two thousand CEs would increase to twenty-eight thousand, according to their statistics.

There is just one hitch.

What would a distribution graph of IFO times look like, one using my 1,158 IFOs, for instance?

The conclusion is obvious: a time profile of stars, planes, meteors, balloons, and what-have-you is identical to the "characteristic" curve for UFOs. Thus, time distributions for UFOs and IFOs must simply be a reflection of the witnesses' viewing behavior rather than the behavior of the UFOs themselves—a more accurate index of actual witness availability than the idealized statistics of Poher/Vallee would suggest.

Otherwise, it would be equally valid to adjust the IFO time plot like the UFO plot, and have it come out exactly the same. That is, the IFOs would all be extremely active at 3 A.M., and just think . . . there are nine times as many of them as there are UFOs. Does this mean that the stars and bright planets are most apparent at 3 A.M.? The airplanes and helicopters? The advertising planes? The meteors? The balloons? Furthermore, if UFOs (and IFOs) were undergoing a giant nightly peak at 3 A.M., why aren't they being observed by a nation full of policemen out on routine patrol while the rest of us are asleep? Remember, the Center for UFO Studies maintained a police hot line and they represented our second largest witness occupation. And they kept us informed of sights as simple as twinkling stars at all hours. For example, when a spectacular meteor passed over California around 3:20 A.M., visible from San Francisco to Los Angeles, I had access to over two hundred known witnesses representing law enforcement agencies, FAA facilities, and Air Force bases. The same thing happened again in California on June 18 at 2 A.M. The important consideration here is not the large number of witnesses so much as the excellent representation of all of the California counties by law enforcement agencies calling in to the state's Office of Emergency Services, who in turn reported to the Center. If a meteor was visible over a given county, someone saw it and reported it ... at 2 and 3 A.M.! Hence, it is a mistake to expect to exaggerate the early morning UFO sightings, or to expect "time distributions" to display anything other than witness characteristics.
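The argument can be made concrete with a small sketch of the availability adjustment. It assumes the correction amounts to dividing each hourly count by the fraction of people outdoors at that hour, which is a simplification rather than the published Poher/Vallee procedure; all counts and fractions are hypothetical.

```python
def adjust_for_availability(counts, outdoors_fraction):
    """Divide each hourly count by the fraction of people outdoors then."""
    return [c / f for c, f in zip(counts, outdoors_fraction)]

def normalise(values):
    """Rescale values so they sum to 1, leaving only the profile's shape."""
    total = sum(values)
    return [v / total for v in values]

# Hypothetical data for four time slots: 9 P.M., midnight, 3 A.M., noon.
SLOTS = ["9 P.M.", "midnight", "3 A.M.", "noon"]
outdoors_fraction = [0.20, 0.05, 0.01, 0.40]
ufo_counts = [40, 15, 10, 5]
ifo_counts = [c * 9 for c in ufo_counts]  # same profile, nine times the volume

ufo_adjusted = normalise(adjust_for_availability(ufo_counts, outdoors_fraction))
ifo_adjusted = normalise(adjust_for_availability(ifo_counts, outdoors_fraction))

print("UFO peak after adjustment:", SLOTS[ufo_adjusted.index(max(ufo_adjusted))])
print("IFO peak after adjustment:", SLOTS[ifo_adjusted.index(max(ifo_adjusted))])
```

When the raw UFO and IFO profiles share a shape, the same adjustment moves both peaks to 3 A.M., so the adjusted curve cannot by itself separate an effect of the objects from an effect of the witnesses.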

Bear in mind two other characteristics regarding the popular use of local time in time distributions like these that you seldom see considered in the literature:

STANDARD TIME VS. DAYLIGHT SAVING TIME

What does it mean when a report collection reflects two roughly equal reporting periods where the "local times" of the reports obtained could be either in "standard time" in the winter or "daylight saving time" in the summer? Stars, bright planets, birds, and other natural phenomena don't set their clocks back and forth, so why not adopt standard time only for, say, the IFO distributions? Because man-made systems (like plane schedules) do adjust; so while it is obviously inadvisable to perform this for IFOs since they are attributable to a multiplicity of stimuli, it might also be inadvisable to do the same for UFOs.

TIME ZONES

When brief, high-altitude phenomena like spectacular meteors and re-entries were reported by many witnesses in two different time zones, it raised a problem . . . statistically. Some of the witnesses would clock it at 9 P.M. local time while the next time zone saw it at 10 P.M. When plotting reports as a function of time, this manifested itself as two separate events "one hour apart." Should a universal time (like Greenwich mean time) be adopted in all reports, then? Consider how these alterations would change the statistical profile of time plots. Where does one draw the line?
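The time-zone problem is easy to reproduce. The sketch below uses Python's standard `datetime` module with fixed standard-time offsets; the date and the choice of Central and Eastern zones are assumptions made for illustration, and daylight saving time is deliberately ignored.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets, assumed for illustration (no daylight saving).
CENTRAL = timezone(timedelta(hours=-6), "CST")
EASTERN = timezone(timedelta(hours=-5), "EST")

def to_utc(local_dt):
    """Convert a timezone-aware local time to universal time."""
    return local_dt.astimezone(timezone.utc)

# One hypothetical meteor, reported from two neighbouring time zones.
report_central = datetime(1977, 1, 15, 21, 0, tzinfo=CENTRAL)  # 9 P.M. local
report_eastern = datetime(1977, 1, 15, 22, 0, tzinfo=EASTERN)  # 10 P.M. local

same_event = to_utc(report_central) == to_utc(report_eastern)

print("Local hours:", report_central.hour, "and", report_eastern.hour)
print("Same instant in universal time:", same_event)
```

Plotted by local hour, the two reports fall in different bins; converted to universal time they coincide, at the cost of no longer reflecting what the witnesses' clocks said.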

UFOs vs. Time—Days of the Week

UFO researcher John Keel took an early stab at plotting reports by days of the week. As a result, he was compelled to consider a "Wednesday effect" due to the preponderance of reports noted in his sample for that day.

David Saunders prepared a daily distribution based on a large sample of UFOCAT reports which yielded different results. Yet compare both Keel's and Saunders' results with this one prepared by a British group called NUFON (Northern UFO Network) for 128 reports in 1975. NUFON notes that only 10 per cent of these reports were assessed to be "unknowns," but then UFOCAT is a mixed UFO/IFO bag too. How about the 113 UFOs in my own study? They would be distributed as shown.

Of course, all four attempts yielded different results. One obvious difference that exists among the different attempts is the varying mixtures of identifiable objects; yet even if all of the samples were completely pre-screened for worthy UFOs only, the huge variety of appearances and behaviors provided by even these reports would still serve to explain why there is so much sample-to-sample variation.

It also points out the fallacy in applying statistical formulae to prove that these results are non-random due to the size of a sample; those formulae are only valid if the same "objects" are being sampled, not the multiplicity of UFO "types" that are being collected under one roof.
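A toy example may help show why a significance test on pooled data says little here. The sketch below computes a Pearson chi-square statistic against a uniform weekly distribution for two hypothetical collections, each a different blend of two unrelated stimuli; every count is invented for illustration.

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic against an equal count on every day."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical weekly counts for two unrelated stimuli.
ad_planes = [5, 5, 30, 5, 5, 5, 5]        # e.g. a regular midweek flight
stars = [10, 10, 10, 10, 10, 25, 25]      # e.g. more observers on weekends

def pooled(weight_planes, weight_stars):
    """Blend the two stimuli in the given proportions."""
    return [weight_planes * p + weight_stars * s
            for p, s in zip(ad_planes, stars)]

sample_1 = pooled(3, 1)  # a collection dominated by ad planes
sample_2 = pooled(1, 3)  # a collection dominated by stars

peak_1 = DAYS[sample_1.index(max(sample_1))]
peak_2 = DAYS[sample_2.index(max(sample_2))]

print(peak_1, round(chi_square_uniform(sample_1), 1))
print(peak_2, round(chi_square_uniform(sample_2), 1))
```

Both statistics are far above 12.59, the 5 per cent critical value for six degrees of freedom, so both samples would be declared "non-random," yet they peak on different days simply because their mixtures differ.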

UFOs vs. Time—By Month

The graphs below show both the Air Force Project Blue Book cases (both UFO and IFO) and my own cases distributed by month. The trends these graphs hold in common are typical of other U.S. samples. The uniformity of this feature from sample to sample is probably due to the fact that it reflects the changing number of potential witnesses from cold weather to warm.

Note that there can be twice as many reports in July as there are in December. What about the reports judged to be UFOs? The graphs below show both the UFOs in my study as well as the Air Force unknowns plotted by month. I would suggest taking the curve more seriously that lacks the influence of the exceptional year for summer unknowns, 1952, in the Air Force curve.

Despite the monthly fluctuations in total reports in both cases, the residual unknowns seem to show no real trends, and remain reasonably even throughout the year.

UFOs vs. Time—Yearly

The graph below depicts the fluctuation in total Air Force Blue Book reports for each year of its operation. The important feature of this distribution is its non-uniformity; the years 1952, 1957, and 1966 stand out as peak years relative to all the rest.

These sudden "waves" of sightings have always been of great interest to UFOlogists; the most provocative inquiry into the nature of UFO waves, however, is the work of Dr. David Saunders, mentioned previously as the originator of UFOCAT. Utilizing that bibliography as a statistical tool, Dr. Saunders sought to find a relationship among UFO flaps as they appeared in the collection. The next graph shows the total contents of UFOCAT by year; Saunders utilized the reports stored in UFOCAT in search of all report waves, their peak dates, the slope rates on either side of the peaks, and their geographic location.

Saunders then needed to establish which waves were genuine, as opposed to false waves artificially stimulated by a burst of publicity surrounding, say, one significant UFO sighting. Unfortunately, this was not done by examining the individual cases that made up these localized flaps but by taking the statistical short cut of studying the shape of the wave on either side of the peak date. If the wave rose sharply up to the peak, and trailed off slowly, it was deemed reasonable to expect that the wave was stimulated by publicity surrounding one big case, with excited IFO sighters jumping on the bandwagon.
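The shape test can be sketched as a skewness calculation on daily report counts. This is a generic moment-based skewness and not necessarily the statistic Saunders used; the daily counts are hypothetical. A sharp rise with a slow decay gives positive skew, and the mirror image gives negative skew.

```python
def skewness(daily_counts):
    """Moment-based skewness of report dates, weighting each day by its count."""
    n = sum(daily_counts)
    mean = sum(day * c for day, c in enumerate(daily_counts)) / n
    m2 = sum(c * (day - mean) ** 2 for day, c in enumerate(daily_counts)) / n
    m3 = sum(c * (day - mean) ** 3 for day, c in enumerate(daily_counts)) / n
    return m3 / m2 ** 1.5

# Hypothetical daily report counts over an eight-day wave.
sharp_rise_slow_decay = [1, 20, 12, 8, 5, 3, 2, 1]
slow_rise_sharp_drop = list(reversed(sharp_rise_slow_decay))

print(f"Sharp rise, slow decay: {skewness(sharp_rise_slow_decay):+.2f}")
print(f"Slow rise, sharp drop:  {skewness(slow_rise_sharp_drop):+.2f}")
```

The calculation looks only at the outline of the wave; it carries no information about whether the individual reports inside it were UFOs or IFOs.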

If the wave rose slowly up to a peak, then dropped off suddenly (a negatively skewed wave), it passed the test. Six positively skewed waves (from 1950 to 1973) were thus dismissed as being caused by the publicity surrounding news releases on Major Donald Keyhoe, Sputnik II, the Socorro case, swamp gas, and the Pascagoula fishermen case. Five negatively skewed waves were found that were related in two ways:

1) They were separated by sixty-one month intervals.

2) The geographical center moved 30° of longitude eastward with each wave around the globe.

The five waves included:

July 8, 1947—western United States

August 3, 1952—eastern United States

August 21, 1957—South America

October 24, 1967—England

November 1972—South Africa
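The two relations above can be laid out as a schedule. The sketch below starts from July 1947, as in the list, and steps forward sixty-one months and 30° of longitude at a time; the starting longitude of -120° (roughly the western United States) is an assumption, since the text does not give one, and the function name is illustrative.

```python
def wave_schedule(start_year, start_month, start_longitude, n_waves,
                  interval_months=61, step_degrees=30):
    """Return (year, month, longitude) for each wave in the progression.

    Longitudes are in degrees, with east positive and west negative.
    """
    waves = []
    for k in range(n_waves):
        months = start_year * 12 + (start_month - 1) + k * interval_months
        year, month_index = divmod(months, 12)
        longitude = start_longitude + k * step_degrees
        waves.append((year, month_index + 1, longitude))
    return waves

# Start date from the text; the starting longitude is an assumed value.
schedule = wave_schedule(1947, 7, -120, 6)

for year, month, longitude in schedule:
    print(f"{year}-{month:02d}  longitude {longitude:+d}")
```

Under that assumed starting point, the schedule lands within about a month of each listed peak date, and its fourth entry falls in late 1962 at about 30° west.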

Note that a 1962 wave is missing. This is permitted since 30° of longitude eastward of South America puts you in the mid-Atlantic where there would be no witnesses. Saunders then concludes that an annual relation between longitude and month of the year exists. While not willing to venture a theory, he defends the significance of his discovery by establishing that "it is difficult to conceive any hypothesis purely in terms of UFO reporting mechanisms." Permit me to suggest a few.

It is a commonly overlooked, yet undeniable, fact that UFO waves are not an increase in sightings (be they UFO or IFOs), but in reports. The distinction is very important, because the nature of any wave, positive or negative, can reflect either an increased number of actual objects in the skies, or a sudden increase in the efficiency of a reporting mechanism, usually in the form of enthusiasm by one researcher in a given area. Take the first negative wave employed by Saunders in his progression in the western states in 1947. Note the worldwide peak in the UFOCAT case-by-year chart. But where did this wave come from? The Air Force didn't observe one; they logged only 122 reports for the whole nation. NICAP, a civilian UFO organization, logged only 20 reports for 1947. The answer is that UFO researcher Ted Bloecher, touring the country as a dancer, searched 142 newspapers in 93 cities in 49 states, two Canadian provinces, and the District of Columbia and turned up 853 reports. This was an extremely rigorous and more efficient way of obtaining UFO reports in that early year than waiting for the reports to come to the UFO organizations. Was an equally rigorous search of newspapers exercised for 1948, 1949, 1950, 1951, 1952, and so on, such that one could fairly compare the difference between 1947 and other succeeding years? No; hence, because of a singularly zealous effort at digging up latent UFO reports, not employed in succeeding years, Saunders and others call 1947 a wave year. It should come as no surprise that the emphasis of press attention in 1947 took place in the western states since press coverage of the famous Kenneth Arnold sighting in the state of Washington kicked off the modern UFO era.

Another example of the effect of reporting mechanisms on localized waves: UFO researcher Ann Druffel once hoped to locate additional witnesses to a February 1977 UFO sighting made by two helicopter pilots flying over Glendale, California. The better than average reporting network in that area failed to turn up any on its own, so Druffel altered its efficiency temporarily by placing an ad in the Montrose Ledger, asking if anyone else saw this specific sighting. The response (for that specific area) was "nothing less than astounding" according to Druffel; eighteen responses from Glendale-La Crescenta citizens. Only one of these took place on the night in question, with most of the rest spread throughout January and February. To quote Druffel's conclusions: "it seems to add credence to the hypothesis that UFO activity might be constant anywhere, any time, but unknown to UFOlogists" . . . until an atypically strenuous effort is made to dig it up, followed by a relaxation back to the previous operations. Thus, the wave of reports—with no assurance that it is a real increase in objects in the sky, or just a better effort at uncovering more reports than are usually sampled.

The last entry in South Africa wasn't any kind of major flap at all. But before it happened, Saunders was already looking for the data. To quote Saunders: "The date and longitude of the 1972 wave in this series were predicted in advance." He prides himself on not having announced what they were, but he himself went searching for a flap along the whole longitude running through Europe and Africa until he found one, no matter how small. Indeed, from now on, dates and longitudes can be predicted in advance, and as in Ann Druffel's experience, some kind of wave can be found (excusing all quiet areas as "too remote," as in the phantom 1962 "Atlantic wave"). That rather limits the unbiased waves available to the ones preceding the anticipated South African one. Even if we generously include the artificially reinforced wave of 1947, that is only four waves separated by three intervals of time and displacement, which is not a lot of data points on which to establish a relationship forever.

Measuring the actual rise/fall slopes of the waves mentioned by Saunders by counting the cases currently in UFOCAT (plus and minus equal durations of three to five weeks, as set out by Saunders) reveals that a negative/positive decision in each wave is not so clear-cut as the simplified drawings would suggest. They don't rise straight up or fall straight down, but require more careful measurements and a closer judgment. Small changes in the known investigative efforts at uncovering cases on either side of the peak date could change the decision from negative to positive. Indeed, removing one source out of 250 (Ted Bloecher's work) would wipe out one of the five waves. Furthermore, the real content of all of the waves, as judged on a case-by-case basis, contains mostly IFOs. Concerning the 1952 wave in the sequence, for example: 1952 was determined by the Air Force to be the year with the greatest number of unknowns in their experience. An independent check of these reports was effected earlier by the Center for UFO Studies, which confirmed this; yet four out of five of the 1952 reports were still IFOs. Eighty per cent IFOs is not far from the typical percentage figures in non-exceptional years; such a huge IFO turnout would be expected in a "flap" year generated by media publicity. Yet Saunders' system had supposedly weeded these out in advance.

Saunders notes that "important, negatively skewed waves have occurred that do not conform to the sixty-one-month cycle." The five waves in his ideal progression were not the only waves uncovered in his search with negatively skewed slopes; the important 1954 major wave in France as well as 1965 wave in the midwestern United States were negative waves, but did not conform to the sixty-one-month separations. The presence of two waves out of seven that supposedly belong as much as the other five (by virtue of Saunders' acceptance criteria), but don't, helps kick the numerological magic right out of his progression. So Saunders briefly lets the five-year relationship drop in favor of an annual relationship between date and longitude for UFO flaps. Now those two maverick waves happily fit this relationship. But with data points restricted to half of one annual cycle, how can it be assumed what the rest of the relationship is? What if it is not linear? What if it varies? Why do we have to wait years to fill in the gaps of a yearly global sweep? Most importantly, the same problems of data validity, randomness, and uniformity apply to this hypothesis as well. No wonder Saunders offers "no affirmative explanation" for his findings when he floats two relationships, five-year jumps, plus annual sweeps at the same time.

Thus, we have a situation where an attractive conclusion is wrong because of the conditions that led to it, plus the existence of "valid waves" that don't fit it. The best analogy I can think of is one from astronomy: Bode's law. The column on the left is a series of numbers generated by taking the series: 0, 3, 6, 12, 24, etc. (doubled each time), adding 4 to each number, and then dividing by 10. That's all. Now compare these numbers with the ones on the right, which represent the distances of the planets from the sun (using the earth's distance as "1"). The seeming agreement is astonishing; however, no astronomer would hold Bode's law to be a real "law" because it breaks down in exactly the same way as "Saunders' law" does. There is a phantom planet between Mars and Jupiter in Bode's law just as there is a phantom wave in 1962 in the mid-Atlantic. Also, two numbers in ten in Bode's progression fail to work out, just like the two waves out of seven in Saunders' findings. The result of both efforts is thus the same: an initially attractive conclusion, but ultimately invalid.
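The Bode's law recipe is simple enough to work through directly. The following is a minimal Python sketch, not a reproduction of the book's table: the planetary distances are rounded modern values in earth-sun units, and the 10 per cent tolerance used to decide whether a term "fits" is an arbitrary assumption made for illustration.

```python
def bode_distances(count):
    """Return the first `count` terms of the progression described above:
    take 0, 3, 6, 12, 24, ... (doubling), add 4, divide by 10."""
    seeds = [0] + [3 * 2**k for k in range(count - 1)]
    return [(seed + 4) / 10 for seed in seeds]

# Approximate mean distances from the sun, with the earth's distance as 1.
# These are rounded modern figures (an assumption of this sketch).
# The slot between Mars and Jupiter has no planet, so it is left as None.
OBSERVED_AU = [
    ("Mercury", 0.39), ("Venus", 0.72), ("Earth", 1.00), ("Mars", 1.52),
    ("(gap)", None), ("Jupiter", 5.20), ("Saturn", 9.54), ("Uranus", 19.2),
    ("Neptune", 30.1), ("Pluto", 39.5),
]

TOLERANCE = 0.10  # hypothetical cut-off: relative error allowed for a "fit"

def compare(observed=OBSERVED_AU, tolerance=TOLERANCE):
    """Pair each predicted term with the observed distance and label it."""
    rows = []
    for (name, actual), predicted in zip(observed, bode_distances(len(observed))):
        if actual is None:
            verdict = "phantom"
        elif abs(predicted - actual) / actual <= tolerance:
            verdict = "fits"
        else:
            verdict = "fails"
        rows.append((name, predicted, actual, verdict))
    return rows

for name, predicted, actual, verdict in compare():
    shown = actual if actual is not None else "-"
    print(f"{name:<10} {predicted:>6.1f} {shown:>6} {verdict}")
```

Under these assumptions the inner terms agree closely, the fifth term has no planet to match, and the last two terms miss badly, which is the pattern the analogy relies on.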

Up until now, alternative theories for UFO waves advanced by skeptics have depended upon oversimplified contrivances like bursts of media publicity wherein society at large invents a flurry of false sightings; while that may work out in general, it is not hard for UFO proponents to defuse the universality of this write-off scheme:

—During a popular radio program on UFOs in France, a national "skywatch" was declared. Listeners were encouraged to go outside on the night of March 23, 1974, and watch for UFOs. Not one of the tens of thousands of citizens produced a report.

—Edward Ruppelt, one of the heads of Blue Book, stated in his book that he failed to statistically locate any relationship between media publicity and UFO waves.

—Author Joseph Blake prepared a chart comparing the annual numbers of UFO articles listed in the Reader's Guide to Periodical Literature and annual numbers of Air Force Blue Book cases. Result? No correlation.

A more plausible alternative explanation for waves (and this is only an alternative) can be developed with an eye toward Ann Druffel's observation that UFO activity might be constant everywhere, coupled with the Condon Report survey which revealed that only 13 per cent of all UFO sighters tried to report events.

What if 100 per cent of the people who thought they had seen a UFO for all dates and places reported their sightings, and it turned out that there was an even level of UFO and IFO activity seen by, say, the 200 million citizens in America during the Blue Book years, a level equal at least to the amplitude of the highest peak year? That would mean that the reason why most years turned in a "residual level" of cases was because 87 per cent of the sighters didn't report their sightings. When a peak year occurs, we are really getting a better taste of the ongoing UFO "activity," which can be brought about by different situations: a breakdown in the resistance of the public to report or an increase in the efficiency of the collection mechanism, like Ann Druffel's newspaper article or Ted Bloecher's extensive newspaper search for one year only. The reader should not be impressed that there must be a gigantic number of UFOs flooding the skies at any time, though, since examination of the content of flap years reveals that the proportion of IFOs to UFOs is still the same as in non-flap years (further supporting this hypothetical model). The 20 per cent UFO rate in the flap of 1952 yields slightly more than three hundred Air Force unknowns. Three hundred real UFOs in 365 days for the whole country (although most of them were in the summer months). The total Air Force yields for 1957 and 1966 were as high as that for 1952, but the Air Force percentages of unknowns (double-checked by the Center for UFO Studies) were only 1.2 per cent and 2.8 per cent.
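The arithmetic behind this alternative model can be sketched in a few lines of Python. Only the 13 per cent reporting rate and the 20 per cent UFO share come from the text; the constant sighting level and the two higher reporting rates are hypothetical values chosen purely for illustration.

```python
# Illustrative sketch only. The yearly sighting level and the two elevated
# reporting rates are hypothetical assumptions, not figures from the text.
TRUE_SIGHTINGS_PER_YEAR = 10_000  # assumed constant level of sightings
UFO_SHARE = 0.20                  # share of reports left unidentified (1952 figure)

REPORTING_RATES = {
    "typical year": 0.13,           # Condon Report figure quoted in the text
    "publicity year": 0.40,         # hypothetical: public resistance breaks down
    "newspaper-search year": 0.80,  # hypothetical: one-off intensive collection
}

def yearly_reports(true_sightings, reporting_rate):
    """Reports collected when only a fraction of sighters come forward."""
    return round(true_sightings * reporting_rate)

def summarize(rates=REPORTING_RATES,
              true_sightings=TRUE_SIGHTINGS_PER_YEAR,
              ufo_share=UFO_SHARE):
    """Map each scenario to (total reports, reports left as unknowns)."""
    rows = {}
    for label, rate in rates.items():
        reports = yearly_reports(true_sightings, rate)
        rows[label] = (reports, round(reports * ufo_share))
    return rows

for label, (reports, unknowns) in summarize().items():
    print(f"{label:<22} reports={reports:>5}  unknowns={unknowns:>5}")
```

With the number of sightings held fixed, the report totals rise and fall with the reporting rate alone, while the proportion of unknowns stays the same in every scenario. A collection of reports cannot by itself distinguish this from a real change in activity.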

Is this wave hypothesis descriptive of reality? Can it be proved or supported to be a better "fit" than the premise that waves are due to an increase in events? No . . . but so long as we collect only UFO reports (with sporadic sampling guaranteed), we'll never be able to tell the difference . . . and that is where we lose track of "science."

Proportions of the Six UFO Types

"How many CE III's are there relative to NLs and DDs?"—a commonly asked question but not an easy one to answer. Obviously, no one can give total numbers for Hynek's six UFO categories, so what about percentages in any one collection? Here we run into the problem again of different types of collecting mechanisms yielding different results. When the Air Force collected thirteen thousand U.S. cases over a period of twenty-two years, they came up with only twelve CE III reports. I don't mean that they ruled out hundreds of CE III candidates; the headquarters at Wright Patterson Air Force Base simply never obtained any. One reason is that the Air Force project obtained its input through intermediaries— the other Air Force bases around the country. These bases quite likely discarded anything having to do with UFO occupants, preventing Blue Book from getting a chance to file them. To demonstrate the differences that can result, here is the breakdown using the Hynek classification scheme of the 113 worthy UFOs in my own study and 587 Air Force UFOs selected by the Center for UFO Studies from the now-available Blue Book files (roughly equivalent to those cases the Air Force judged "unknowns"):

          AIR FORCE     MY STUDY
          (587 UFOs)    (113 cases)
NL        38%           70%
DD        42%           16%
RV         5%            0%
CE I       7%            8%
CE II      5%            2%
CE III     1%            4%

Another recently published collection of 1,242 reports gleaned from such diverse sources as the books of Charles Fort, newspaper clippings, and Fate magazine yielded a 6.1 per cent occurrence of CE III reports. Remember that UFO publications favor the more interesting Close Encounters and edit out the lesser sightings. Naturally, the judgment of the collectors has a lot to do with the report percentages indicated. In my collection, only two out of seven CE III candidates were removed from UFO status. A UFO skeptic would find reason to discard them all, while proponents would try their best to save as many as possible. The reader should not be led to feel that the small percentages of Close Encounter cases mean that there is an insignificant number of such reports. UFO researcher Ted Bloecher has made a special point of collecting nothing but CE III reports. Unlike my own collection procedures, Bloecher includes non-current cases, foreign cases, reports of strange creatures without UFOs present, even incidents of disembodied voices—and from all sources. As a consequence, he has collected over 1,500 CE III reports, most of which include the sighting of both UFO and UFOnaut.‡ If an equal effort is made to obtain all (uninteresting) Nocturnal Lights and Daylight Discs and such, however, you still wind up with the same percentages.

Finally, one of the biggest concerns in determining the percentages of the different UFO types has to do with variable reluctances in reporting them. There is a minimal threat in telling others about an orange light moving erratically about the night sky, or even a silver oval in the daytime sky.

‡Similarly, researcher Ted Phillips has maintained a catalogue of over a thousand physical trace cases from all sources, and all levels of investigation and quality. Researcher Mark Rodeghier has found over eight hundred cases of electromagnetic interference claims.

Now, how quickly would you pick up the phone to tell the police that a metallic-looking craft with large-headed occupants has landed on your property? Or even that they had abducted you? Or that a depressed ring in a field was caused by a UFO? This gives you some feel for the well-known observation that the Close Encounter cases can require years for the witnesses to build up enough nerve to reveal their experiences. This is not generally the case with the distant, less embarrassing (or threatening) UFO manifestations. Consequently, it may not be fair to determine the percentages of UFO types for a given period until the years have gone by . . . but then, is it fair to mix in CE III's that were investigated when memories were fresh with accounts that are years old and nearly impossible to double-check? In terms of the IFOs, you can resolve even a mutilated description of an ad plane by calling the company now . . . you can't do it years later.

Law of UFO Occurrence

Is there universal agreement on the type of environment in which most UFO witnesses find themselves? Compare these conclusions culled from the UFO literature:

My own collection? I spoke to all of my witnesses firsthand. I didn't have to plot maps, refer to the average characteristics of counties, or employ other rough techniques to judge the nature of the sighting environment. I merely asked the witnesses to characterize it for themselves. The use of counties fails to provide enough resolution for this purpose; even Cook County in Illinois, which contains all of Chicago, also contains forest preserves! The results of my own direct inquiries run as follows:

                              ALL IFOS    ALL UFOS
URBAN (densely populated)     25%          8%
SUBURBAN (residential area)   46%         33%
RURAL (sparsely populated)    29%         59%

Note the decided emphasis of the UFO reports to occur in non-populated areas relative to the IFO control group. Let's go one step further and separate the 113 UFO reports into distant sightings and Close Encounters:

                              97 NON-CES    16 CE I, II, AND III’S
URBAN (densely populated)     25%            8%
SUBURBAN (residential area)   46%           33%
RURAL (sparsely populated)    29%           59%

Having corroborated the notion that the more exotic reports originate in less populated regions quite specifically, we can afford the luxury of plotting the IFO and UFO reports on a map of the United States: The IFOs cluster in areas that suggest a dependence on the more populated areas of the United States, while the UFO reports exhibit a somewhat less constrained distribution. All of this fits in with the apparent consensus of UFOlogists described earlier. It does not serve to bolster the automatic interpretation of most UFOlogists, however, that an intelligently guided avoidance principle is in effect.

Remember, we are dealing in reports, not events. It is not difficult to imagine a situation where IFOs, distantly seen UFOs, and exotic Close Encounters are witnessed equally in all areas, regardless of population density, but the reluctance of people to report a high-strangeness sighting reaches an inhibitive level in urban settings. Furthermore, UFOlogists are likely to dismiss urban Close Encounters out of hand. A CE III case with unearthly occupants alleged to have been seen in the desert is "safe"; the absence of additional witnesses was "due to the non-populated surroundings." Yet take that exact same event and stage it in downtown Los Angeles, and the witness is concluded to be unreliable, for surely there would have been additional witnesses in a situation like that.

Preference for Special Features

It doesn't require familiarity with the UFO literature to know that UFOs are "supposed" to manifest themselves in the vicinity of certain man-made and geographical features. You've probably noted statements proffered by UFO researchers that pick on power lines as being closely associated with UFO sightings. A detailed examination of the UFO literature, however, reveals any number of correlations between UFO sightings and some other attractive agent:

—power lines and transformers

—Strategic Air Command bases, other military installations

—nuclear power plants

—schoolyards

—mines

—quarries

—bodies of water

—geological fault lines

—longitudes synchronized with the stellar background

—non-populated areas

—populated areas

—coastlines

Directionless? Yet each effort is often accompanied by sightings plotted on maps against areas of interest as well as theories to account for the individual patterns: power lines mean UFOs are either "stealing kilowatts" or even that the electromagnetic effects of the lines are "causing" the UFOs to become visible. Military installations are equated to UFO surveillance of strategic areas . . . and so on. In truth, UFOs have been reported to appear over any kind of feature in the world; there have been accounts of UFOs purportedly flying over the White House, the United Nations, and the Eiffel Tower. There is even a CE III account in the literature with multiple witnesses that took place just outside of Manhattan. Yet although UFOs appear in all areas at random, the ones that occur near sites of interest like nuclear power plants or Air Force bases become conspicuous.* Searching for meaning in an elusive phenomenon, researchers hope that the connection may be more than coincidence and research the literature for other examples of the same nature, without adopting a general approach to all reports.

Furthermore, the various UFO groups guarantee the continuance of patterns by limiting the number of multiple choices in questionnaires about sighting environments. Look at these sample questions culled from the major UFO groups' questionnaires, and ask yourself how predominant such features as "power lines" would be if the questionnaire included other equally distributed features like "roads, trees, supermarkets, bowling alleys, traffic signs, clear days, two-story residences, fields," etc. for comparison? These features will never be included because they are not as provocative. Investigator forms cannot show you what else was in the area because there is no interest in including "irrelevant" features. To show how silly this kind of prejudgment can get: a UFO researcher (who shall go nameless) once called the Center for UFO Studies because there was a power failure at his place of business, and he was curious to find out if there were any UFOs reported in his area.

*Remember, military bases have sentinels and twenty-four-hour logged records of daily events, unlike most of the rest of the world. Consequently, it can seem that they have more UFOs "hovering nearby." Similarly, police are out patrolling while the rest of us are asleep.

We've already noted in the IFO Message chapter that both UFO and IFO witnesses exhibited preconceptions about UFO appearance and behavior despite their stated lack of reading on the subject. This also applies to statements that witnesses made about sightings that turned out to be IFOs. I never led witnesses with questions about these items. Examples include:

Case 353—"hovered over power line" = helicopter in Washington

Case 397—"trying to use the radio towers for some purpose" = ad plane, seen again the next night

Case 468—"unusual movement with respect to power lines" = star viewed for two and one-half hours for three nights

Case 519—"no power lines to attract it" = Venus, viewed for forty-five minutes

Case 524—"power lines present" = prank balloon

Case 551—"near TV tower" = star, seen for one hour

These are significant because I never asked the witnesses if they saw "power lines" or "TV towers"; they volunteered these observations. Yet most of them claim to be ignorant of UFO literature. If the reader gathers that I see serious problems in the attempt to link UFO sightings to any special feature, he'll agree that it is with good cause.

Project Blue Book's Special Report 14

No discussion of statistical applications to UFO reports would be complete without mention of the most conspicuous large-scale effort at examining the distinction between UFOs and IFOs, Project Blue Book's Special Report 14. Commissioned by the Air Force in 1951 to the Battelle Memorial Institute, Special Report 14 was intended by Captain Edward Ruppelt, then head of the Air Force project, to yield more information about UFO reports, but certainly not to solve the mystery. Instead, the controversial study came to serve both as the cornerstone of the Air Force's eventual policy of skepticism and as ammunition for UFO proponents, who exploit it to show the Air Force reaching a negative conclusion unsupported by its own statistical results.

The Battelle Institute worked with 2,199 reports collected by the Air Force between June 1, 1947, and December 31, 1952. They classified these, based on the available information, into various IFO categories (e.g. astronomical, birds, balloons, etc.), a psychological category, an "unknown" category, and significantly, an "insufficient information" category. UFOlogists are quick to remind us of this last group since it seems to show that a valid "UFO" in the Institute's eyes was not a judgment based on partial information. Battelle was apparently confident that they had the information necessary to come to a valid conclusion. Other examples of the Battelle Institute's discrimination included the separation of all reports into four reliability groups, deemed "excellent," "good," "doubtful," and "poor"; these were based on the witnesses' ages, occupations, training, fact-reporting abilities, and attitudes, coupled with the report's completeness and internal consistency. For a report to be judged "unknown," a decision by the entire identification panel was required. Finally, the "unknowns" didn't have to include threshold decisions, since these could have easily been dropped into the "doubtful IFO" categories provided.

Why do UFOlogists think that the Air Force burned themselves with Report 14? The project separated all of the UFOs from the IFOs and pigeonholed both groups into six descriptive categories: shape, color, duration, speed, number, and brightness. The UFO and IFO reports were compared for similarity of characteristics, using a statistical device known as a chi-square test. This yields a numerical correlation in percentages for the different characteristics being compared. Under the criteria of this comparative test, the Air Force bet that the UFOs would behave like the IFOs. The Air Force lost. In five of the six comparisons, the chi-square test resulted in a less than 1 per cent chance of the UFOs and IFOs representing the same stimuli.

As if to prove they wanted a non-exciting conclusion, the Air Force issued a statement that the subjectivity of the data resulted in a false conclusion. If the data were so subjective and unreliable from the start, why did they undertake the statistical project in the first place? There was no mention in the press release of the chi-square test results that formed the bulk of the work. This, coupled with the contradictory statement that the project did not support the existence of UFOs, demonstrated the Air Force's biased attitudes.

There is a good deal more about the Battelle study that can be exploited by UFOlogists. In the press release concerning Special Report 14, the Air Force boasted of a low 3 per cent UFO rate plus a low 7 per cent "insufficient information" rate for the first four months of 1955. They failed to mention that the Battelle study evaluated 20 per cent of all the sightings from mid-1947 to 1952 to be "unknowns"! The Air Force claimed that the refinement of their investigative methods by 1955 reduced both the unknowns and the insufficient information cases. Yet using figures later released by the Air Force, look at how the following years fared:

        UNKNOWNS    INSUFF. INFO.
1959    3%          17%
1963    4%          15%
1966    3%          24%

Hardly what you would call "improved" case investigation with that many incomplete reports! When the Battelle Institute identification panel re-examined the Air Force material, they consistently came up with higher percentages of "unknowns" than did the Air Force investigators:

        AIR FORCE†    BATTELLE
1947    10%           28%
1948     5%           11%
1949    12%           12%
1950    13%           23%
1951    13%           27%
1952    20%           20%

†Based on final figures released in 1969.

Furthermore, a group decision required for a report to be deemed "unknown" helps outweigh the decisions of any single UFO skeptic. An unexpected result took place when the reports were differentiated into the four categories mentioned earlier, "excellent" through "poor." The "excellent" reports contained an exceptional 33 per cent unknown rate coupled with a low 4 per cent insufficient information rate. The reports judged "poor," however, contained only 17 per cent unknowns and 21 per cent insufficient information. Hence, by Battelle's own judgment, UFOs were not the province of poor observers or reports. Remember, too, that there was a separate category for insufficient information reports, so that couldn't be charged, either. As researcher Bruce Maccabee has pointed out, if there were no true unknowns, the observers in the "excellent" observer class saw more of these false unknowns as judged by the Battelle group. This is, of course, paradoxical.

The Battelle group also tried to see if there was a single model that would describe all of the unknowns. They concluded that they had only twelve UFO accounts that were described in sufficient detail to permit an artist's sketch to be drawn . . . but whose fault was that, except the investigators'? On the basis of the variety encountered in twelve cases, it was concluded that no single model of a UFO was emerging out of the reports. Thus, the concluding sentence of Special Report 14 reads, "It is considered to be highly improbable that any of the reports of unidentified aerial objects examined in this study represent observations of technological developments outside the range of present-day scientific knowledge."

What a red herring! The chi-square comparison tests could never have hoped to prove or disprove anything like that. All the project was designed to do was examine whether or not the UFO reports, in the main, could be "likened" to the IFO reports. Utilizing the chi-square tests, it succeeded ... in making the UFOs a unique class of objects. Yet, did the Air Force really have to worm their way out of this undesired conclusion in such an awkward way? Let's take a second look at the data. The six categories chosen for comparison (shape, duration, color, brightness, number, and speed) plotted on graphs for both UFOs and IFOs look like this:

Surprised? In graph form, the bodies of data don't seem especially different at all; indeed, one might predict that a chi-square test would show a strong numerical correlation for the different profiles. The Battelle Institute thought so, too, after examining their own graphs. Yet the application of the chi-square test showed a probability much less than 1 per cent that the UFO/IFO distributions represented the same characteristics for five of the six categories. "Brightness" showed a reasonable match, well beyond the 5 per cent level. You know the project was unhappy with the result when they immediately sought ways to fudge the results, such as excluding the abundance of astronomical sightings from the knowns to create a better fit. Yet my own assessment is that they shouldn't have interpreted the chi-square tests so literally. Here's why:

1) INPUT IS NOT ABSOLUTE

Consider the six categories chosen in light of the discussion in the IFO Message chapter:

Shape. Recall all the varieties of false shapes that were attributed to ad planes, aircraft seen only by their lights, even stars; look through the six IFO sections in this book at the different shapes attributed to reports which could otherwise be confidently identified, then ask yourself how literally the shapes provided for both the "knowns" and the "unknowns" by the Air Force's witnesses should be taken.

Duration. As you will recall from the previous discussion in this chapter, duration turned out to be more an arbitrary condition of the witness, not the object being viewed.

Speed. Also discussed previously; if you can't determine the distance or size of an unknown projectile, you can't assess its speed with accuracy, not to mention those head-on "stationary" planes and clouded-out stars that "rush away in seconds."

Brightness. The most difficult of the six to relay accurately in subjective terms.

Color and number inspire more confidence for accuracy, but are less substantial as discriminators than the other four for separating UFOs from IFOs.

Given the large latitude for error for these factors borne out in my own IFO study, the Special Report 14 staff should not have tried to pin down comparisons any tighter than the graphs they drew, which did indeed demonstrate basic congruence of the six different profiles. The catch in utilizing a chi-square test is that the numbers you plug into the formulae are expected to be absolute, which these are not. Since all that matters in chi-square is the differences that exist between two bodies, the "unknowns" could merely be a different numerical distribution of "knowns" from the ones weeded out and could still generate the low correlation figures. Don't forget, as large as the data sample was in this test, the body of "knowns" was really a combination of stars, planes, balloons, kites, birds, and other unrelated items. Another sample started from scratch could likely have had a different "character." Simply because the Battelle Institute was able to quantize their reports doesn't mean that the numbers should have been taken so seriously as to subject them to the rigors of a sensitive test like chi-square. If the Battelle group had had a real appreciation for how loose the data were, they never would have bothered with a statistical comparison to begin with. Their behavior in the conclusion section makes it clear, however, that they wanted to take the numbers seriously ... if the results would have enabled them to "kill" the UFOs.
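One way to see how sensitive a chi-square comparison is can be sketched in Python. The counts below are hypothetical and are not Battelle's data: two profiles over six shape bins that would look nearly identical on a graph are compared first at 100 reports each and then at 2,000 reports each with the same proportions. The 1 per cent critical value for five degrees of freedom (about 15.09) is the standard table value.

```python
def chi_square_statistic(group_a, group_b):
    """Pearson chi-square statistic for two rows of category counts
    (a two-sample test of whether both rows share one distribution)."""
    total_a, total_b = sum(group_a), sum(group_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(group_a, group_b):
        column_total = a + b
        if column_total == 0:
            continue  # an empty category contributes nothing
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts over six shape bins (not figures from Special Report 14).
KNOWNS_SMALL = [48, 20, 12, 8, 7, 5]
UNKNOWNS_SMALL = [40, 24, 14, 9, 7, 6]

# Same proportions, twenty times as many reports in each group.
SCALE = 20
KNOWNS_LARGE = [n * SCALE for n in KNOWNS_SMALL]
UNKNOWNS_LARGE = [n * SCALE for n in UNKNOWNS_SMALL]

CRITICAL_1_PERCENT_DF5 = 15.086  # table value: 1% level, 5 degrees of freedom

stat_small = chi_square_statistic(KNOWNS_SMALL, UNKNOWNS_SMALL)
stat_large = chi_square_statistic(KNOWNS_LARGE, UNKNOWNS_LARGE)

for label, stat in (("100 reports each", stat_small),
                    ("2,000 reports each", stat_large)):
    differs = stat > CRITICAL_1_PERCENT_DF5
    print(f"{label:<20} chi-square={stat:6.2f}  differs at 1% level: {differs}")
```

In this sketch the small samples pass comfortably, while the large samples fail at the 1 per cent level even though the two profiles have not changed shape. The statistic grows in proportion to the number of reports, so modest differences in loosely recorded categories are enough to produce a "significant" result when the sample is large.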

2) INADEQUACY OF THE SIX FACTORS

The next thing that bothers me is the notion that shape, color, duration, brightness, number, and speed are sufficient to discriminate the identities of UFOs and IFOs. I could wish that other, more important considerations like "angular size" or "manner of disappearance" had been included, because using Battelle's six chosen categories, one could state a strong case for a "good" chi-square relationship between city buses and Indian elephants:

Furthermore, an important factor like shape was reduced to only six subtypes, one of which was "other." "Elliptical" ran away with most of the shapes, while the others represented rather strange choices that leave a lot to be desired. My own questionnaire had eleven highly varied shapes, one of which was left open for a description not covered by the rest. Believe me, I would have had a terrible time trying to fit them into the Battelle study's shapes.

3) INCOMPLETE REPORTS

That the project should divide its unknowns into excellent, good, doubtful, and poor classes of completeness and reliability sounds reassuring at first. But look at the number of cases, both "known" and "unknown," that had unstated parameters:

UNSTATED:     IFO REPORTS    UFO REPORTS
Shape         23%            27%
Duration      25%            24%
Color         12%            14%
Brightness    65%            68%
Speed         39%            31%
Number         2%             1%

This is outrageous! In my own reports, I would never have dreamed of making an IFO/UFO judgment without important parameters like shape and duration. Instead of dumping these reports into the "insufficient information" pile where they belong (or better yet, seeking out the additional data) they saw fit to make commitments on them. To judge reports like these as "UFOs" and "IFOs" and to include them in the chi-square tests is sloppy investigative and statistical process. Nor should they have included "not stated" figures in the chi-square tests at all, since "not stated" is not a characteristic of the "knowns" and "unknowns," just of poor investigation.‡ No wonder the study could find only 12 UFO descriptions out of 434 that were complete enough to create a sketch.

In short, the Air Force shouldn't have had to wriggle out of releasing the body of the report to the public; their only real error in Special Report 14 was undertaking the statistics in the first place.

‡Bruce Maccabee has refigured the chi-square tests without them—the results are still largely the same.
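The objection to the "not stated" figures can be made concrete with a small sketch. The code below computes a Pearson chi-square statistic for a known/unknown table twice, once with a "not stated" row and once without it. All counts and the placeholder shape names are hypothetical and chosen only for illustration; they are not Battelle's data, and the function name `chi_square` is an assumption of this sketch rather than anything from Special Report 14.

```python
def chi_square(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count if category and IFO/UFO status were independent.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# HYPOTHETICAL counts as (IFO reports, UFO reports) per shape category.
# The category names other than "elliptical", "other" and "not stated"
# are placeholders, not the study's actual subtypes.
shape_counts = {
    "elliptical": (400, 120),
    "shape B": (100, 25),
    "shape C": (80, 30),
    "shape D": (60, 10),
    "shape E": (40, 15),
    "other": (90, 30),
    "not stated": (230, 85),
}

# Test as run with "not stated" treated as if it were a shape.
with_unstated = chi_square(list(shape_counts.values()))

# Same test after dropping the reports whose shape was never recorded.
without_unstated = chi_square(
    [counts for name, counts in shape_counts.items() if name != "not stated"]
)

print("with 'not stated':    chi2=%.2f, dof=%d" % with_unstated)
print("without 'not stated': chi2=%.2f, dof=%d" % without_unstated)
```

Dropping the row removes one degree of freedom and every report that contributed nothing but a blank. Whether the verdict changes depends on the real counts, which is the check the footnote above attributes to Maccabee.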

Conclusion

There is hardly a statistical effort that has ever been applied to the UFO phenomenon that is not problematic in its construction or interpretation. Short of some seemingly impossible changes in the collection mechanisms that feed these efforts, it hardly seems likely that such efforts in the future will fare any better. That doesn't mean that they won't be undertaken, as the temptation to reduce large bodies of UFO data to statistical conclusions is very strong; "overinterpretation" is always the real menace. So the reader should ask himself some rather stiff questions when he encounters future efforts of this kind:

1) Does the report collection reflect truly random sampling?

2) Have the individual entries been adequately validated? Remember, a thin veil can often separate an identifiable object from a truly "worthy" UFO. Beware of statistical exercises that boast of thousands of reports in the data; there aren't thousands of well-investigated reports. Also, beware of attempts to catalogue every known UFO in a certain category (e.g. the Center for UFO Studies' Physical Trace [CE II] Catalogue) when cases are plugged into the statistics whether they are anecdotal or well studied. Efforts to weight the probability of cases or even to divide them into certain or non-certain groups virtually never appear in these compilations. When this did happen in Report 14, for example, it was ignored and all four levels of report quality were stuffed into the statistics so there could be sufficiently large numbers.

3) Are "apples and oranges" being summed or compared? Are NLs necessarily the same genre of UFO as DDs or CE II's? The huge variety of UFO shapes and behaviors works against any casual grouping of UFOs. Collections of IFOs are a mixture of many unrelated sources with unrelated characteristics—is this also true for the UFOs?

4) Beware of attempts to obscure the different details among cases by oversimplifying them and then comparing them. All three of these examples could be reported as a "3.5-foot-tall green humanoid enters UFO":

A 3.5-foot-tall being wearing a green suit and head covering with a lens glided down from an 18-foot metallic disc with a central opening on its underside and three visible windows.

A 3.5-foot-tall creature with a large bald head, large red eyes, and a green skin was reported by a ten-year-old boy to float outside his bedroom window at night. The creature then retreated to a cube-shaped craft with red lights.

A 3.5-foot-tall creature with a dark head, luminous eyes, a clear, spherical helmet attached to a backpack by a tube, and glowing, tight green coveralls was seen carrying a device like a "mine detector." It literally walked up a wall and disappeared; it was again seen in a domed disc which rose away from the scene.

5) Beware of attempts to collect as many reports as possible of, say, EM interference cases or abductions accompanied by the question: "Can all of these people be wrong [or liars or crazy]?" Remember, for every valid UFO judgment there were nine UFO impostors (IFOs) where the sincere witnesses were most certainly wrong. You seldom get reminded of this truism, however.

6) The worship of correlation is one of the major mistakes made in evaluating statistics. It is often thought that if statistics succeed in determining a high correlation between two or more sets of objects, or effects, then causality has been determined, regardless of the relationship between the sets compared. One publication on "Fortean" phenomena (a wide variety of anomalous events, "falls" from the sky, creatures, etc.) tried to tie them all together (UFOs included) by virtue of their appearance in space and time in the United States, leading to a unified explanation theme for the whole lot. The truth is that it cannot be claimed that two effects are causally connected simply because they vary concomitantly; however, it can be claimed that if one effect varies while the other holds constant, then the effects are not causally connected. Here are two different kinds of false cause-and-effect relationships that really took place:

Disguised causality: A car owner complained to the dealer that whenever he drove to the grocery store, his success in restarting his automobile to go home depended on the flavor of ice cream that he bought. If he bought vanilla, he had no trouble starting his car; the opposite was true if he bought chocolate, strawberry, or any non-vanilla flavor. An engineer from the company accompanied the man on his shopping trips and confirmed that this connection did happen . . . but not for mystical reasons. The counter with vanilla ice cream was located next to the front door; all the other flavors were in the back of the store, which required more time to shop, and time for a vapor lock to form in his carburetor.

Random causality: Bernard Fremerman, writing in the Zetetic Scholar, has prepared a plot of annual fluctuations in American industrial manufacturing production vs. annual sunspot variations, ranging from 1870 to 1970. The nineteen peaks and dips in the eleven-year cycle of both curves match up remarkably well. Startling as this correlation appears, it does not point the way to sunspot number as a primary reason for changes in American manufacturing output.
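Both kinds of false relationship can be reproduced with a short simulation. This is only a sketch under stated assumptions. In the first part, the failure probabilities and shopping times are invented; the point is that the car responds to minutes spent in the store, so moving the vanilla counter to the back makes the "flavor effect" vanish. In the second part, independent random walks stand in for any pair of slowly drifting series. They are not Fremerman's data, and the function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def failure_rates(rng, trips, vanilla_by_door=True):
    """Disguised causality: flavor matters only through time spent in the store.

    All probabilities and durations here are made up for illustration.
    """
    fails = {"vanilla": 0, "other": 0}
    totals = {"vanilla": 0, "other": 0}
    for _ in range(trips):
        flavor = "vanilla" if rng.random() < 0.5 else "other"
        # Hidden variable: minutes in the store, set by where the counter is.
        minutes = 2 if (flavor == "vanilla" and vanilla_by_door) else 10
        # The car only "knows" how long it sat, not what was bought.
        p_fail = 0.05 if minutes < 5 else 0.80
        totals[flavor] += 1
        fails[flavor] += rng.random() < p_fail
    return {flavor: fails[flavor] / totals[flavor] for flavor in fails}


def random_walk(rng, steps):
    """Cumulative sum of independent Gaussian steps."""
    position, path = 0.0, []
    for _ in range(steps):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


def share_of_strong_correlations(rng, pairs, steps, threshold=0.5):
    """Random causality: fraction of independent walk pairs with |r| >= threshold."""
    strong = 0
    for _ in range(pairs):
        r = pearson(random_walk(rng, steps), random_walk(rng, steps))
        strong += abs(r) >= threshold
    return strong / pairs


rng = random.Random(0)  # fixed seed so the run is repeatable

observed = failure_rates(rng, 10_000)                      # vanilla by the door
moved = failure_rates(rng, 10_000, vanilla_by_door=False)  # vanilla moved to the back
share = share_of_strong_correlations(rng, pairs=500, steps=100)

print("failure rate by flavor, vanilla by the door:", observed)
print("failure rate by flavor, vanilla at the back:", moved)
print("share of unrelated series pairs with |r| >= 0.5:", share)
```

In the first run, flavor "predicts" starting trouble almost perfectly; in the second, the same flavors show no difference because the hidden variable has been equalized. The third figure shows how often two series with no connection at all still produce a correlation that would look impressive on a plot.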

The deeper question to be asked about the significance of statistics applied to the UFO phenomenon is: do UFO statistics represent a valid pursuit for more knowledge about this elusive phenomenon, or do they merely reflect frustration that none of the individual reports are capable of standing on their own two feet? Are UFO statistics a bold first step . . . or a desperate last resort?
