Prepare for the Society of Actuaries PA Exam with our comprehensive quiz. Study with multiple-choice questions, each providing hints and explanations. Gear up for success!

Each practice test/flash card set has 50 randomly selected questions from a bank of over 500. You'll get a new set of questions each time!



How is entropy defined in the context of decision trees?

  1. A measure of the depth of a tree

  2. A measure of the impurity of each node

  3. A fixed value used for all nodes

  4. A measure of total node count

The correct answer is: A measure of the impurity of each node

In the context of decision trees, entropy is a measure of the impurity or disorder of a node. It quantifies the unpredictability, or randomness, of the class labels among the data points at that node. When building a decision tree, the goal is to create splits that produce child nodes that are as pure as possible, meaning each contains instances predominantly from a single class.

Calculating a node's entropy involves finding the proportion of each class within the node and applying the entropy formula: the negative sum, over all classes, of each class's proportion multiplied by the logarithm of that proportion. The lower the entropy (closer to zero), the purer the node, since one class predominates; higher entropy indicates a more mixed, impure node.

By using entropy to measure impurity, decision tree algorithms can make informed choices about how to partition the data, which leads to better classification performance. This concept is central to how decision trees are constructed, since reducing entropy at each split (that is, maximizing information gain) produces a more effective model.
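As an illustration only (not part of the exam question), here is a minimal Python sketch of that calculation. The function name node_entropy and the example class counts are hypothetical, and base-2 logarithms are assumed, although any base works as long as it is used consistently.

```python
import math

def node_entropy(class_counts):
    """Shannon entropy of a node, given the count of each class at that node.

    Returns 0 for a pure node (all instances in one class) and is largest
    when the classes are evenly mixed.
    """
    total = sum(class_counts)
    entropy = 0.0
    for count in class_counts:
        if count == 0:
            continue  # a class with no instances contributes nothing
        p = count / total
        entropy -= p * math.log2(p)  # accumulates -sum(p * log2(p))
    return entropy

# A pure node has entropy 0; a 50/50 split of two classes has entropy 1 bit.
print(node_entropy([20, 0]))   # 0.0   -> pure node
print(node_entropy([10, 10]))  # 1.0   -> maximally impure (two classes)
print(node_entropy([18, 2]))   # ~0.47 -> mostly one class
```

The three example nodes show the pattern described above: the mixed node has the highest entropy, and as one class comes to dominate, the entropy falls toward zero.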