
General Guide to Behavioral Testing in Mice

May 13, 2017 | February 24th, 2020


It is common practice in science to test new drugs on rodents before testing them in human populations. These tests consist of administering drugs to rodents, then testing various aspects of their behavior which reflect cognitive functions, sensory-motor function, social interactions, anxiety, depression, and substance dependence. However, there are many points to consider when conducting behavioral testing in rodents.

Generalizability to humans, how test groups are selected, individual differences among rodents, their living conditions, animal-experimenter interactions, test repetition, how data are collected, and how results are interpreted all greatly affect the results of behavioral testing. Each of these factors can influence the reliability and validity of the behavioral measures.

Factors Affecting Behavioral Testing in Mice

Rodent Sensory Modality

Since the goal is to obtain results that are translatable to humans, it is important to be aware that rodents perceive the environment differently than humans do. Unlike humans, rodents do not rely heavily on vision; they rely primarily on their olfactory system. When rodents do use vision, it works best in dim, nocturnal conditions, as their physiology is adapted to a nocturnal lifestyle. Humans, on the other hand, use their vision more effectively by day than by night. Other differences include rodents’ use of their whiskers as sensory organs and their ability both to detect ultrasound and to use it for communication. It should also be noted that several commonly used mouse strains have restricted hearing abilities. Furthermore, a lack of stimulation in a rodent’s environment, which is typical in laboratories, can lead to impaired sensory development.

When designing an experiment, it is important to select tests and animals in a way that the experiment’s results are transferable to human populations.

Selecting Groups

The method of selecting groups can affect behavioral testing. Assigning rodents to groups by complete randomization can, by chance, produce two groups that are unequal in ability. Semi-randomization should therefore be used, wherein the groups are formed so that they have equal ability at baseline. If the treatment and control groups are not similar in ability, the results will not be valid, since one group may be better at the task before testing even begins.
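
One way to picture the semi-randomization described above is matched-pair assignment: rank the animals by a baseline score, take them in adjacent pairs, and flip a coin within each pair. The following Python sketch is only an illustration under that assumption; the function name `matched_assignment`, the animal IDs, and the baseline latencies are hypothetical and do not come from a specific protocol.

```python
import random
from statistics import mean


def matched_assignment(baseline, seed=0):
    """Split animals into two groups with similar baseline performance.

    baseline: dict mapping animal ID -> baseline score (hypothetical data).
    Animals are ranked by score and taken in adjacent pairs; within each
    pair, a random draw decides which animal goes to which group.
    """
    rng = random.Random(seed)
    ranked = sorted(baseline, key=baseline.get)
    treatment, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # the random element of the assignment
        treatment.append(pair[0])
        control.append(pair[1])
    if len(ranked) % 2:  # an unpaired animal goes to a randomly chosen group
        rng.choice([treatment, control]).append(ranked[-1])
    return treatment, control


# Hypothetical baseline latencies (seconds) for eight mice
baseline = {"m1": 12.0, "m2": 30.5, "m3": 14.2, "m4": 28.9,
            "m5": 19.7, "m6": 21.1, "m7": 45.0, "m8": 41.3}
treatment, control = matched_assignment(baseline)
print("treatment mean:", mean(baseline[a] for a in treatment))
print("control mean:", mean(baseline[a] for a in control))
```

Because each pair contributes one animal to each group, the group means can differ by no more than the average within-pair gap, whichever way the random draws fall.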

Task Difficulty

Task difficulty in behavioral testing is one factor that can influence the validity of the behavioral measures. If a task is either too easy or too hard, differences between the two groups cannot be detected. Therefore, it is important to use appropriate parameters in the experiment. A task that is not demanding enough can produce ceiling effects, meaning that even impaired rodents achieve close to optimal results. On the other hand, a task that is too difficult can produce floor effects, wherein all animals fail. Tasks that continually increase in difficulty or stimulus intensity avoid both problems and are good for testing rodents with a broad range of abilities and trait levels. An example of such a test is the ledged tapered beam. In this test the beam starts out wide, making it easy to traverse, but it gradually becomes narrower, increasing the difficulty of the task.
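
A quick check on pilot data can flag ceiling or floor effects before the main experiment. The sketch below is a minimal illustration, not an established criterion: the function name `ceiling_floor_check`, the 0 to 10 score range, the pilot scores, and the 50% threshold are all assumptions chosen for the example.

```python
def ceiling_floor_check(scores, min_score, max_score, threshold=0.5):
    """Report how many scores sit at the extremes of the scale.

    threshold is an arbitrary, illustrative cutoff: if at least this
    fraction of animals scores at the maximum (or minimum), the task is
    flagged as probably too easy (or too hard) to separate groups.
    """
    n = len(scores)
    floor_fraction = sum(s <= min_score for s in scores) / n
    ceiling_fraction = sum(s >= max_score for s in scores) / n
    return {
        "floor_fraction": floor_fraction,
        "ceiling_fraction": ceiling_fraction,
        "floor_effect": floor_fraction >= threshold,
        "ceiling_effect": ceiling_fraction >= threshold,
    }


# Hypothetical pilot scores on a task scored from 0 to 10
pilot_scores = [10, 10, 9, 10, 10, 10, 8, 10]
result = ceiling_floor_check(pilot_scores, min_score=0, max_score=10)
print(result)
```

Here six of the eight pilot animals reach the maximum score, so the task would be flagged as too easy under the assumed threshold.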

Rodents are Individuals

When selecting rodents for the experimental groups, it is important to remember that rodents are individuals and as such can behave differently. They differ in maternal behavior, and females’ estrous cycles vary. When drugs are tested on rodents, plasma levels can also vary between individuals.

Another factor that varies between individual rodents is motivation. Behavioral tests may evaluate an animal’s ability to solve a task or its motivation to complete an action, so differences in cognitive ability and/or motivation can be a confounding factor in these studies. To address differences in motivation, many tests attempt to get the rodent to perform at its highest motivational level by using a highly motivating stimulus. Fear and hunger are the two motivators typically used for this purpose. However, using them can influence the animal’s later performance.

Living conditions are another important factor to consider. This includes housing: a rodent may be housed with other rodents, which influences the individual. Moving rodents between cages during cage cleaning can induce anxiety, and lighting conditions can influence the rodent’s behavior.

Animal-Experimenter Interactions

Animal-experimenter interactions can affect the rodents and thereby change behavioral results. These influences can arise simply from different experimenters implementing testing protocols in slightly different ways, or from how experienced or comfortable the experimenter is when working with rodents.

Rodents can tell the difference between male and female testers (Sorge et al., 2014). They can also become familiar with experimenters.

Test Repetition

Test repetition is a factor that comes in at the experiment design level and can influence results. It is often necessary to repeat a test, for instance in developmental research, ageing research, or when the disease being studied has a dynamic and prolonged course. Additionally, baseline testing is necessary to select two groups of rodents that are similar in performance and cognitive ability before actual testing begins. The baseline test should measure the rodents’ ability to perform in the actual experiment, and therefore the baseline test and the actual experiment should not be too different.

Experiments which are more complicated can require not only baseline testing, but also extensive training before starting the actual experiment. An important point to consider is that when rodents perform experiments in behavioral testing, they are able to learn skills which they carry from one test to the next.

Test repetition can lead to learning effects, practice effects, and generalization. Possible ways to reduce repetition effects include allowing time between tests and varying test order. Another strategy is to use a test that measures multiple behaviors rather than a single one. An example of such a test is the multivariate concentric square field test, which uses a complex arena to measure multiple behaviors and later create a behavioral profile for the rodent.
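
Varying test order, as mentioned above, is often done by counterbalancing. The sketch below uses a simple cyclic Latin square so that each test appears in each position equally often across animals. It is an illustration only: the function names, the three-test battery, and the animal IDs are hypothetical, and a cyclic square balances position but not which test immediately precedes which.

```python
def latin_square_orders(tests):
    """Return len(tests) orderings in which every test occupies
    every position exactly once (cyclic Latin square)."""
    n = len(tests)
    return [[tests[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(animal_ids, tests):
    """Cycle through the Latin-square rows so the orders are spread
    evenly across animals."""
    orders = latin_square_orders(tests)
    return {a: orders[i % len(orders)] for i, a in enumerate(animal_ids)}


# Hypothetical test battery and animal IDs
battery = ["open field", "elevated plus maze", "Y maze"]
animals = ["m1", "m2", "m3", "m4", "m5", "m6"]
schedule = assign_orders(animals, battery)
for animal, order in schedule.items():
    print(animal, "->", ", ".join(order))
```

With six animals and three tests, each ordering is used twice, so no single test is always run first or last.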

Data: Collection & Analysis

When conducting experiments, there is always a possibility of bias; that is, the experimenter may (not necessarily consciously) judge the behaviors of the rodents or other results in a manner that supports the scientist’s hypothesis rather than objectively. This becomes a particular problem when the person who performs genotyping, surgery, or treatment administration also conducts the behavioral tests. To reduce this problem, it is very important to use blinding methods. These are methods that keep the experimenter totally or partially unaware of information pertaining to the rodents being tested. For instance, one experimenter could label each animal with a number and link this number to more information about the rodent; another experimenter could then conduct the experiment with no information about the rodents besides the arbitrary numbers they were assigned. For more information about guidelines on how to avoid bias, refer to the CAMARADES and ARRIVE guidelines (Kilkenny et al., 2010).
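
The numbering scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `make_blinding_key`, the three-digit code range, and the example animals are hypothetical. The first experimenter keeps the key; the second receives only the list of codes.

```python
import random


def make_blinding_key(animals, seed=None):
    """Assign each animal a unique arbitrary code.

    animals: dict mapping animal ID -> group label (hypothetical data).
    Returns (key, blinded): key links code -> (animal ID, group) and stays
    with the first experimenter; blinded is the list of codes handed to
    the experimenter who runs the behavioral tests.
    """
    rng = random.Random(seed)
    codes = rng.sample(range(100, 1000), len(animals))  # unique 3-digit codes
    key = {code: (animal, group)
           for code, (animal, group) in zip(codes, animals.items())}
    blinded = sorted(key)  # sorted so list order carries no group information
    return key, blinded


# Hypothetical animals and group labels
animals = {"m1": "treatment", "m2": "control", "m3": "treatment", "m4": "control"}
key, blinded = make_blinding_key(animals, seed=1)
print("Tester sees only:", blinded)
```

After scoring is complete, the key is used to map each code back to its animal and group for analysis.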

Data collection and result interpretation can also become a source of decreased reliability and/or validity. Rodent behavior is often recorded using categorical scales, since it is difficult to classify their behavior using a single continuous variable. This scoring can be subjective and not objective if the experimenter is aware of the treatment the rodent has received. Additionally, behavioral results can be hard to interpret since behaviors can have multiple underlying causes. An experimenter observing a behavior may classify it as one they wish to see while in fact, the behavior is not caused by the underlying mechanism that the experimenter wishes to observe. This can result in a false observation.

Considerations Before Testing

Considering all the factors that can influence the reliability and validity of behavioral testing in rodents, it is prudent to try to create a general guide for conducting rodent behavioral research. Some possible solutions for increasing reliability and validity in rodent behavioral testing are:

  1. A priori rodent selection
  2. Selecting the appropriate experiment design
  3. Possibly redesigning existing tests so that interpretation of the results is objective and not subjective
  4. Implementing automated testing.

A priori selection of the type of rodents to be involved in the testing can involve deciding which sex, age, and species of rodent should be tested. It should also be taken into consideration that their sensory modalities differ from those of humans. Another factor to consider is the housing conditions under which the rodents have been kept. Have they been housed with other rodents? What light-dark cycles have they been exposed to?

Selecting the appropriate experiment design is very important. Some factors to consider here are: generalizability of the experiment to humans, rodent group randomization, human-rodent interaction, test repetition, and experimenter bias. Other considerations include the practical and economic constraints of the experiment, along with the ethics of the experiment.

It might also be possible to redesign an existing test so that interpretation of the results is objective and not subjective. For example, the modification of the Elevated Plus Maze into the Elevated Zero Maze consists of removing the center zone, since the amount of time the rodent spends there can be hard to interpret (Shepherd et al., 1994; Braun et al., 2011).

A better way to avoid some of these problems is to use novel testing methods that allow for automation. Automated testing keeps the conditions of the experiment consistent for all of the rodents that participate in it. For instance, automated testing can be implemented in Operant Chambers, with the only possible confounding factor being the moment when the animal is placed in the testing chamber. Some examples of mazes that can be automated are the 8 Arm Radial Maze, Elevated Plus Maze, T maze, and Y maze.


There are many factors to consider when performing rodent behavioral testing. Suggestions for increasing reliability and validity in rodent testing have been discussed. With all of these in mind and proper caution taken, the influence of these factors on behavioral testing results can be greatly reduced, leading to more accurate results.


  1. Braun, A.A., Skelton, M.R., Vorhees, C.V., and Williams, M.T. (2011). Comparison of the elevated plus and elevated zero mazes in treated and untreated male Sprague Dawley rats: effects of anxiolytic and anxiogenic agents. Pharmacol. Biochem. Behav. 97,
  2. Hånell, A., and Marklund, N. (2014). Structured evaluation of rodent behavioral tests used in drug discovery research. Front. Behav. Neurosci. 8:252. doi:10.3389/fnbeh.2014.00252
  3. Kilkenny, C., Browne, W.J., Cuthill, I.C., Emerson, M., and Altman, D.G. (2010). Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. PLoS Biol. 8:e1000412. doi:10.1371/journal.pbio.1000412
  4. Shepherd, J.K., Grewal, S.S., Fletcher, A., Bill, D.J., and Dourish, C.T. (1994). Behavioural and pharmacological characterization of the elevated “zero-maze” as an animal model of anxiety. Psychopharmacology (Berl) 116, 56–64. doi:10.1007/bf02244871
  5. Sorge, R.E., Martin, L.J., Isbester, K.A., Sotocinal, S.G., Rosen, S., Tuttle, A.H., et al. (2014). Olfactory exposure to males, including men, causes stress and related analgesia in rodents. Nat. Methods 11, 629–632. doi:10.1038/nmeth.2935