Some basic principles: First, the higher the frequency of affected animals, the easier it is to detect any gene. In the ideal case, half the animals would be diseased and half healthy. But of course that case should (certainly we all hope) be unlikely. The point to remember is that analyzing a rare event is more difficult than analyzing a common one. Second, the trait should be one that can be evaluated objectively. Disease traits are usually easy to classify, but behavior traits and some structural traits defy a unique, repeatable, and objective scoring system. Although it is possible to analyze subjectively scored characters, the power to do so effectively is lower than it is for easily quantifiable traits.
Sample Size: “What is the sample size I need to find a gene?” This is probably the most common question, yet the most difficult to answer. “The more the merrier” would be the simplistic reply. A realistic answer is tied to the incidence of the disease or character under study. An adequate analysis of a rare condition requires more animals and more families than does a common ailment. A comfortable, round figure would be a minimum of 500 to 600 animals (with accompanying records) from at least 100 families. The figure of 500 animals does not include animals in a pedigree whose phenotype is unknown. In addition, the term family is not limited to the typical father-mother-son-daughter group. Instead, it is meant to include all animals in a given pedigree, such that the list of animals in one family does not overlap with the list of animals in any other family. If two lists share even one animal, then these apparently different families are actually a single family. A trait with an incidence below 10% will likely need more than 500 animals to get an accurate sense of the patterns of inheritance. Remember that these are rough, “ballpark” figures. More than 500 animals, spread across a great number of families, will always be welcome.
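The definition of a family given above (lists of animals that do not overlap) can be applied mechanically once each animal's sire and dam are recorded. The following is a minimal sketch, assuming every animal has a unique ID and that an unknown parent is recorded as None; the function name group_families and the record layout are illustrative choices, not part of any particular pedigree package.

```python
from collections import defaultdict


def group_families(records):
    """Group animals into non-overlapping families.

    records: iterable of (animal_id, sire_id, dam_id) tuples.
             Use None for an unknown parent (an assumption of this sketch).
    Returns a list of sets, one set of IDs per family.
    """
    parent = {}  # union-find forest: ID -> representative link

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for animal, sire, dam in records:
        find(animal)  # register the animal even if both parents are unknown
        for p in (sire, dam):
            if p is not None:
                union(animal, p)  # a shared animal merges two families

    families = defaultdict(set)
    for animal in list(parent):
        families[find(animal)].add(animal)
    return list(families.values())


# Hypothetical example: animal 6 appears only as a dam, never on its own row.
example = [(1, None, None), (2, None, None), (3, 1, 2),
           (4, None, None), (5, 4, 6)]
for family in group_families(example):
    print(sorted(family))
```

For the hypothetical records shown, the sketch yields two families, one containing animals 1, 2, and 3 and the other containing animals 4, 5, and 6. Counting the resulting sets gives the number of families to compare against the rough target of 100.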
Critical Information: Before any analysis can be undertaken, the data must be collected in a computer-usable form. Pictorial pedigrees are nice, but they mean little to a computer and hence have to be translated into a form more amenable to analysis. A spreadsheet usually works quite nicely. Nothing elaborate is needed, just a sheet of information with columns for: Animal Name, Animal ID Number, Sire Name, Sire ID Number, Dam Name, Dam ID Number, Date of Birth, Sex, and Phenotype (including a code if this information is missing). Numbers are usually better than names, because misspellings are so common and there is no way for the computer to know that “Daisy May” and “DaisyMay” are the same dam. Additional information on dogs is always helpful, such as coat color codes, body weight, or any other measurement or condition that may be related to the trait of interest. The best general advice is to write down everything you know about each animal and include that information (preferably coded) as part of the data line for each animal in the data set.
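As an illustration of the spreadsheet layout described above, the sketch below reads a comma-separated export with those nine columns and runs a few basic checks. The lowercase column names, the missing-value code "NA", the 0/1 phenotype coding, and the function name check_pedigree are all assumptions made for this example; any consistent coding would serve.

```python
import csv
import io

# Column names are an assumed spelling of the fields listed in the text.
COLUMNS = ["animal_name", "animal_id", "sire_name", "sire_id",
           "dam_name", "dam_id", "date_of_birth", "sex", "phenotype"]
MISSING = "NA"  # assumed code for "information not available"


def check_pedigree(csv_text):
    """Return (rows, problems, known), where known counts scored phenotypes."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    problems = []
    seen_ids = set()
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        # Every field should hold a value or the explicit missing code.
        empty = [c for c in COLUMNS if row.get(c) in (None, "")]
        if empty:
            problems.append(f"line {line_no}: empty field(s) {empty}")
        animal_id = row.get("animal_id")
        if animal_id in seen_ids:
            problems.append(f"line {line_no}: duplicate animal_id {animal_id}")
        seen_ids.add(animal_id)
    # Animals with an unknown phenotype do not count toward the sample size.
    known = sum(1 for row in rows
                if row.get("phenotype") not in (None, "", MISSING))
    return rows, problems, known


# Hypothetical data: phenotype coded 0 = healthy, 1 = affected, NA = unknown.
SAMPLE = """animal_name,animal_id,sire_name,sire_id,dam_name,dam_id,date_of_birth,sex,phenotype
Rex,101,NA,NA,NA,NA,2001-03-14,M,0
Daisy May,102,NA,NA,NA,NA,2002-06-01,F,1
Pup,103,Rex,101,Daisy May,102,2005-09-20,M,NA
"""

rows, problems, known = check_pedigree(SAMPLE)
print(len(rows), "animals,", known, "with known phenotype")
for message in problems:
    print(message)
```

Because parents are linked by ID number rather than by name, a misspelled name such as “DaisyMay” in the dam_name column would not break the link to animal 102 in this layout.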