Cool and hot executive functions in relation to aggression and testosterone/cortisol ratios in male prisoners

‘Cool’ executive functions (EF) refer to logical and strategic cognitive processes such as planning and reasoning, whereas ‘hot’ EF include affect-driven cognitive processes, such as risk-taking in decision making. The present cross-sectional study investigated whether prisoners perform worse than non-prisoners on measures of hot and cool EF. Subsequent objectives were to determine whether performance on EF tasks was related to measures of (reactive and proactive) aggression within the offender group, and whether testosterone and cortisol influenced the latter relationship. Male prisoners (n = 125) and a non-offender control group (n = 32) completed frequently applied measures of hot and cool EF (the Iowa Gambling Task and the Wisconsin Card Sorting Task, respectively). Aggression characteristics in prisoners were assessed through self-report questionnaires, behavioural observations, and conviction histories. Endogenous testosterone and cortisol levels were obtained through saliva samples, while prenatal testosterone exposure was estimated from the relative lengths of the index and ring fingers (the ‘2D:4D ratio’). The results indicated that prisoners performed significantly worse than non-prisoners on cool EF, and to a lesser extent on hot EF, but no meaningful relationship was found between measures of EF and aggression in the offender group. Weak to moderate significant correlations were found between testosterone/cortisol ratios (but not prenatal testosterone exposure) and hot EF as well as self-reported aggression. These results lead to the conclusion that prisoners show significant problems in cool and hot EF compared to non-prisoners. These problems are not clearly associated with characteristics of aggression, but preliminary results indicate that they may be related to having high endogenous testosterone levels relative to cortisol levels.


Introduction
Executive functions (EF) usually refer to deliberate, top-down neurocognitive processes involved in the conscious, goal-directed control of thought, action, and emotion, of which mental set-shifting, planning and monitoring, and inhibition of prepotent responses are the best known [1,2]. These EF have been labelled as relatively 'cool' cognitive functions, in which reasoning plays an important role. In contrast, so-called 'hot EF' refer to more intuitive top-down control processes that operate in motivationally and emotionally significant high-stakes situations [1,3], such as risk taking in decision making [1,4-6]. Although cool and hot EF typically work together as part of a more general adaptive function, they also appear to be relatively independent constructs and to rely on different neuroanatomical structures [1]. While performance on traditional, cool neuropsychological EF tasks predominantly depends on the functioning of dorsolateral prefrontal regions, performance on hot EF tasks mainly relies on the functioning of the ventromedial and orbitofrontal cortex [7-9], which, for example, plays an important role in the reappraisal of initially learned response-reward contingencies [7].
Deficits in EF are associated with a wide range of problems in daily life functioning, including criminal and aggressive behaviour, which is often due to insufficient self-regulation [10]. This robust relationship between poor EF and criminal/aggressive behaviour has been well established in two large meta-analyses [11,12]. However, such relationships have mainly been studied using traditional cool EF tasks and to a far lesser extent with hot EF measures [11]. With respect to the latter, there is evidence that the tendency to take risks in decision making is related to aggressive behaviour [11,13,14], but it is not yet clear on which underlying neurobiological mechanisms this depends and whether this relation is similar for all types and degrees of severity of aggression. It appears that the tendency to take risks in decision making is mostly related to reactive aggression, which refers to impulsive aggressive behaviour as a result of high emotional arousal, often in response to a perceived provocation or threat. Risky decision making is related to a lesser extent to proactive forms of aggression, which refer to goal-directed, instrumental aggressive behaviour aimed at obtaining a desirable advantage [14,15]. This relation between risky decision making and reactive aggression could be explained by the fact that both are linked to dysfunction in similar neurological substrates, such as the striatum, the orbitofrontal and ventromedial prefrontal cortex, and the amygdala [16-20].
Furthermore, it has been suggested that steroid hormones such as testosterone are key regulators of the functioning of these brain regions [18,21]. Indeed, high testosterone levels have been related both to increased (reactive) aggression [18,22-25] and to more risk taking in decision making [26-29]. However, the relationship between testosterone on the one hand and decision making and aggression on the other hand is complex and not always confirmed in the literature [30,31]. This may be because the effect of testosterone on aggression and decision making is moderated by cortisol levels in such a way that high testosterone only results in increased aggression and impaired decision making in individuals with low cortisol levels [32]. Multiple studies that used the testosterone/cortisol ratio support this "dual hormone hypothesis" in relation to both aggression and risk taking [22-24,26,33-35].
Taken together, there is compelling evidence that antisocial behaviour, including aggression, is related to poor performance on cool EF tasks, but less is known about how hot EF measures concerning risk taking in decision making are related to (reactive) aggression. The latter is important to investigate, not only because it has been suggested that hot EF are crucial cognitive functions for social functioning [7], but also because hot EF may be important to address directly in the treatment of pathological aggression [36]. Furthermore, studies are accumulating which suggest that both hot EF and reactive aggression may be influenced by testosterone/cortisol levels in different parts of the brain [22-24,26,33-35]. However, few studies have taken all these factors together in one study or investigated this in offender populations. In addition, little is known about the potential role of testosterone and cortisol in the relationship between aspects of aggression and cool EF. Hence, the present study aims to investigate (1) whether male prisoners perform worse than non-prisoners on measures of cool and hot EF, (2) if outcome on measures of hot and cool EF can be statistically predicted by measures of aggression, and (3) whether this relation between aggression and EF may be influenced by testosterone/cortisol levels. It was hypothesized that (1) prisoners perform worse than non-offender controls on both measures, but (2) show a stronger relation between hot EF and (reactive) aggression than between cool EF and (reactive) aggression, and that (3) the relationship between reactive aggression and poor performance on hot EF measures is related to levels of testosterone/cortisol, in the sense that high levels of endogenous testosterone paired with low cortisol are related to higher amounts of reactive, but not proactive, aggression and to worse performance on a measure of hot, but not cool, EF.

Setting, participants and procedure
Participants were recruited in a large prison setting in the Netherlands (Penitentiary Institution Vught). All participants were adult males (18 years and older), who volunteered for this study after being informed through posters, pamphlets and information letters. A total of 159 participants initially entered the study. Of these, 34 dropped out for personal reasons (e.g. declining further participation) or practical reasons (e.g. sudden transfer or release), resulting in a total of 125 participants. All participants were accused of or convicted for committing criminal acts, varying from minor offenses, such as theft, to severe violent crimes, such as murder, or sex crimes. Sentences varied from several weeks to life-long imprisonment, sometimes in combination with special treatment programs. Ninety-eight participants had at least one lifetime conviction for a violent crime, while 27 were convicted of non-violent crimes only.
Participants were included if they were stable enough to participate (e.g. not having suffered from a psychotic, manic or major depressive episode in the 6 months prior to testing). Furthermore, participants needed to be able to read the Dutch language well enough to fill in questionnaires.
For each participant in the prisoner group, the study procedure lasted four weeks in total. During this period staff members rated the aggressive behaviour of the participants on a weekly basis. Participants had three meetings with a research assistant. In the first meeting informed consent was signed, descriptive data were gathered, and an intelligence screening was performed. During the second meeting, neuropsychological testing took place and aggression questionnaires were filled in by the participants. Saliva samples were also collected, following a structured protocol (for details see 'materials'). In the last meeting, participants who wished to receive it were given feedback on their individual test results. After the testing phase, judicial records were studied to obtain data on each participant's crime history.
The non-prisoner control group (n = 32) consisted of male prison employees, who were recruited through letters, information meetings or e-mail. Prison employees are all screened for records of good behaviour as part of a standard procedure upon employment, so it could be guaranteed that this group had clean criminal records. They completed neuropsychological testing only and did not participate in assessments of aggression or testosterone/cortisol. This study was conducted according to the ethical principles of the Helsinki Declaration and approved by the Dutch Ministry of Justice and Security with respect to procedural and ethical aspects. All participants signed informed consent. Providing a saliva sample was optional and required additional informed consent. No rewards were provided for participation.

Materials

Reactive and proactive aggression
Characteristics of aggression were measured in three ways: through self-report questionnaires, behavioural observations and criminal records. The self-report questionnaires included the 30-item Impulsive/Premeditated Aggression Scales (IPAS-30 [37]), which provides a score for impulsive and a score for instrumental aggression, the 23-item Reactive-Proactive Aggression Questionnaire (RPQ [38]), which provides scores for reactive and proactive aggression, and the shortened 12-item Dutch translation of the original 29-item Aggression Questionnaire (AQ [39,40]), which provides four scales (physical aggression, verbal aggression, rage and hostility). The 12-item AQ was used instead of the longer, original version, because this short version has been shown to have better psychometric properties [41]. Observational data on aggressive behaviour were gathered with the Social Dysfunction and Aggression Scale (SDAS-11 [42]). This is an observational scoring list consisting of 11 items, each scored on a 5-point Likert scale. Staff members were asked to score the SDAS four times, with an interval of one week, so that a stable total mean score could be calculated. A minimum of three ratings needed to be present for a participant's SDAS score to be included in the statistical analyses. Finally, criminal records provided information on conviction histories. The total number and type of (violent) criminal convictions were registered.
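The rule for the observational score (four weekly ratings, at least three required) can be illustrated with a minimal Python sketch. The function name `sdas_mean` and the use of `None` for a missing weekly rating are hypothetical choices made for this illustration; the source does not describe how missing ratings were coded.

```python
from statistics import mean
from typing import Optional, Sequence


def sdas_mean(ratings: Sequence[Optional[float]],
              min_ratings: int = 3) -> Optional[float]:
    """Mean of the weekly SDAS total scores for one participant.

    Missing weekly ratings are passed as None (an assumption of this sketch).
    Returns None when fewer than `min_ratings` ratings are available, so the
    participant can be left out of the analyses.
    """
    present = [r for r in ratings if r is not None]
    if len(present) < min_ratings:
        return None
    return mean(present)


# Three of four ratings present: the mean is computed.
print(sdas_mean([4, None, 6, 8]))     # 6
# Only two ratings present: the participant is excluded.
print(sdas_mean([4, None, None, 8]))  # None
```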

Hot and cool EF
The Iowa Gambling Task (IGT [43]) was used as a measure of hot EF. The IGT is regarded as an adequate measure of intuitive decision making in ambiguous and risky circumstances [44]. Because it contains ambiguous reinforcers, it is supposed to resemble daily life decision making closely. In the present study, the IGT was administered as a standard computerized task in which participants were confronted with four decks of cards. They were instructed to select one card at a time, with the consequence of winning or losing fictitious money. Although participants were informed that some decks were better than others, they were not told which decks were advantageous (i.e. giving small rewards and small losses) or disadvantageous (i.e. giving high rewards and high losses). Each participant completed 100 deck draws, leading to a total score and five consecutive 'block' scores of 20 draws each. Normally, individuals tend to choose randomly at first, but develop a clear preference for the safe decks during the final 40 draws. The last two blocks (representing the final 40 draws) are especially relevant with respect to risky decision making [44], while the first three blocks are characteristic of decision making under ambiguity [45]. The last two block scores and the total NET-score were used for the statistical analyses. There is convincing evidence that IGT performance explains a unique part of the variance in decision making that is not attributable to intelligence or to traditional measures of executive functions (inhibition, set-shifting or working memory) [46].
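The block and total scores described above can be sketched in Python as follows. The text does not give the scoring formula, so this sketch assumes the conventional IGT net score (number of advantageous draws minus number of disadvantageous draws); the deck labels A to D and the function names are likewise assumptions for illustration only.

```python
from typing import Dict, List, Sequence

# Assumed labelling: decks C and D are advantageous, A and B disadvantageous.
ADVANTAGEOUS = {"C", "D"}


def net_score(choices: Sequence[str]) -> int:
    """Advantageous minus disadvantageous draws (assumed net-score rule)."""
    good = sum(1 for deck in choices if deck in ADVANTAGEOUS)
    return good - (len(choices) - good)


def igt_scores(choices: Sequence[str], block_size: int = 20) -> Dict[str, object]:
    """Net score per consecutive block of draws plus the total net score."""
    blocks: List[int] = [
        net_score(choices[i:i + block_size])
        for i in range(0, len(choices), block_size)
    ]
    return {"blocks": blocks, "total": net_score(choices)}


# Hypothetical participant: alternates between a bad and a good deck for
# 60 draws, then settles on an advantageous deck for the final 40 draws.
draws = ["A", "C"] * 30 + ["C"] * 40
print(igt_scores(draws))  # {'blocks': [0, 0, 0, 20, 20], 'total': 40}
```

Under this scoring rule, the last two entries of `blocks` correspond to the two block scores that entered the analyses together with the total.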
To assess cool EF, the computerized version of the Wisconsin Card Sorting Task (WCST [47]) was used. Participants were instructed to sort different pictures into categories but were given no information about the underlying sorting principles prior to the test. The only feedback provided after each sorting attempt was whether it was 'right' or 'wrong'. When the correct sorting principle (colour, form or number) had been applied consistently and repeatedly, the rule changed without notification, requiring a flexible and analytical response in order to search for another sorting principle. In contrast to the IGT, performance on the WCST relies less on intuition and more on logical thinking and deliberative decision making, and the task is widely used worldwide for both clinical diagnostic and research purposes [48]. The number of achieved categories and the number of perseverative responses were included as WCST outcome measures in the statistical analyses, because these can be regarded as the best general indicators of performance on the task [48].

Measures of testosterone and cortisol
Endogenous testosterone and cortisol levels of prisoners were assessed through saliva samples that were collected during resting conditions in the test room with Cortisol-Salivettes® [49]. These are tubular synthetic swabs that absorb saliva when placed in the mouth, and they have been shown to be an effective saliva collection device for diagnostic tests, even for low-volume samples and/or samples with low cortisol concentrations. Although prior research has shown that testosterone levels obtained with these salivettes can turn out higher than those obtained through other salivary methods, this potential overestimation has been shown to be consistent across different samples [50]. Saliva samples were collected in the afternoon, because of circadian changes in testosterone and cortisol levels [24,51]. Participants refrained from drinking, eating or smoking for 30 minutes prior to saliva collection. They then rinsed their mouth with water and were told to place the Cortisol-Salivette® in their mouth and chew it for approximately 45 seconds before placing it back in the tube. This was repeated after 30 minutes. Both closed, marked tubes were then stored in a freezer at -20°C. After all participants had been tested, the saliva samples were thawed and centrifuged for 10 min at 2000 g to obtain clear fluids. The samples were analysed with the Testosterone Saliva and Cortisol Saliva ELISA assays (IBL International) according to the instructions for use. All samples were measured in duplicate and the sample volumes were 50 µl. Initially, the testosterone concentrations of multiple participants turned out to be too high for analysis. Therefore, all samples were re-analysed in a 1:3 dilution with sample diluent.
For each of the two saliva samples the testosterone/cortisol ratio was computed by dividing the testosterone value by the cortisol value. The two resulting ratio scores were then averaged into one mean ratio score per participant.
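As a minimal sketch of this two-step computation (ratio per sample, then the mean of the two ratios), the following Python example may help. The function names and the hormone values are hypothetical, and values are assumed to be in the units reported by the assays, without any unit conversion.

```python
from statistics import mean
from typing import Sequence, Tuple


def tc_ratio(testosterone: float, cortisol: float) -> float:
    """Testosterone/cortisol ratio for a single saliva sample."""
    if cortisol <= 0:
        raise ValueError("cortisol must be positive to form a ratio")
    return testosterone / cortisol


def mean_tc_ratio(samples: Sequence[Tuple[float, float]]) -> float:
    """Mean of the per-sample ratios for one participant.

    `samples` holds (testosterone, cortisol) pairs, one pair per sample.
    """
    return mean(tc_ratio(t, c) for t, c in samples)


# Hypothetical values for the two samples of one participant.
samples = [(400.0, 4.0), (300.0, 2.0)]
print(mean_tc_ratio(samples))  # 125.0 (mean of 100.0 and 150.0)
```

Note that averaging the two ratios is not the same as dividing the mean testosterone by the mean cortisol: for the hypothetical values above the latter would give 350 / 3 ≈ 116.7 rather than 125.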
Prenatal exposure to testosterone was assessed by the 2D:4D ratio [52]. Finger length in millimeters was measured for the index (2D) and ring (4D) fingers of both hands, from the lowest crease where the finger joins the palm of the hand up to the tip of the finger. The length of the index finger was then divided by the length of the ring finger. Normally, the right hand best represents the testosterone level, but it is recommended to measure both hands [53,54]. The 2D:4D ratios of the two hands were analysed separately.

Raven Standard Progressive Matrices (RSPM)
The RSPM is a non-verbal intelligence test [55] in which abstract reasoning is essential. Participants were instructed to fill in the missing part of a pattern, choosing from a set of options. The test was selected on the basis of its completion time and its applicability to people who were not raised with the Dutch language. Dutch norms were applied [48] to obtain percentile scores, which were then converted into IQ estimates.
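The text does not specify how percentile scores were converted into IQ estimates. One common convention, shown here purely as an assumption, maps the percentile to a z-score under a normal distribution and rescales it to mean 100 and standard deviation 15. The floor of 70 follows the note to the descriptives table, which states that the IQ estimate has a minimum set at 70; the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_to_iq(percentile: float, floor: float = 70.0) -> float:
    """Convert a norm-based percentile (0-100, exclusive) to an IQ estimate.

    Assumes a normal IQ distribution with mean 100 and SD 15, which is a
    convention adopted for this sketch, not a detail taken from the study.
    """
    if not 0 < percentile < 100:
        raise ValueError("percentile must lie strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile / 100)
    return max(floor, 100 + 15 * z)


print(round(percentile_to_iq(50)))  # 100
print(percentile_to_iq(1))          # 70.0 (floored; the unfloored value is about 65)
```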

Statistical procedure
Comparisons of mean values between the two study groups were performed using t-tests when data were normally distributed. Because much of the data were not normally distributed, non-parametric Mann-Whitney tests were conducted to assess whether prisoners differed from non-offender controls in their distribution of scores on measures of hot and cool EF. Effect size estimates were calculated by converting z-scores [56,57].
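The conversion formula is not spelled out in the text. A widely used conversion for Mann-Whitney tests is r = z / √N, where N is the total number of observations; the sketch below assumes that this is the conversion meant. The sample size N = 157 (125 prisoners plus 32 controls) is also an assumption, since the number of valid cases per test is not reported and may be lower because of missing data.

```python
import math


def effect_size_r(z: float, n_total: int) -> float:
    """Effect size r from a Mann-Whitney z-score, assuming r = z / sqrt(N)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return z / math.sqrt(n_total)


# z-values as reported in the results; N = 157 is an assumed full sample.
print(round(effect_size_r(1.98, 157), 2))   # 0.16
print(round(effect_size_r(-3.85, 157), 2))  # -0.31
```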
To assess if aggression measures could be predicted by outcome on measures of hot and cool EF, bootstrapped linear regression analyses were performed with forced entry. Age and estimated IQ-scores were also entered as predictors in the model. It was not possible to enter interaction terms for the dependent variable × age, or × IQ, to control for moderator effects, because this would lead to a proliferation of predictors in the model. Instead, these moderation analyses were run separately to gauge how large the risk of moderation effects would be.
The main predictors of interest were the last two block scores and the total NET-score of the IGT, and the number of perseverative errors, non-perseverative errors and completed categories on the WCST. To assess whether subscale scores on the self-report aggression questionnaires reflected different aggression sub-constructs (e.g. impulsive versus instrumental aggression), an exploratory factor analysis with oblique rotation was conducted. All subscale scores were entered as variables in this factor analysis. Subscales that loaded on the same factor could then be combined into one dependent variable for that factor in the regression model (by calculating the mean of the standardized values of these subscale scores) to reduce the number of analyses.
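Combining subscales into one dependent variable (the mean of the standardized subscale scores) can be illustrated with a short Python sketch. The scale names, the data and the use of the sample standard deviation (n - 1) are assumptions for this illustration, and complete data for every participant are presumed.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence


def zscores(values: Sequence[float]) -> List[float]:
    """Standardize scores using the sample mean and sample SD (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_score(scales: Dict[str, Sequence[float]]) -> List[float]:
    """Per-participant mean of the standardized scores across scales.

    Each entry of `scales` lists one scale's scores in the same
    participant order.
    """
    standardized = [zscores(scores) for scores in scales.values()]
    return [mean(person) for person in zip(*standardized)]


# Hypothetical scores of three participants on two scales with different ranges.
scales = {"scale_a": [1, 2, 3], "scale_b": [10, 20, 30]}
print(composite_score(scales))  # [-1.0, 0.0, 1.0]
```

Standardizing first keeps a scale with a wide score range from dominating the composite.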
Because there were relatively large percentages of missing values on the testosterone/cortisol measures, a missing value analysis was performed to determine if these data were missing at random across the sample.
Initially, it was planned to enter the testosterone/cortisol ratio as an interaction term in the regression analyses to investigate whether the strength of the relationship between measures of hot and cool EF on the one hand and measures of aggression on the other hand would be different for participants with high or low testosterone/cortisol ratios. Unfortunately, we were only able to collect a small number of saliva samples (n = 38), because most prisoners refused to contribute to this part of the study. As a result, it was not possible to include these variables in the regression model. Instead, those data were analysed in a more exploratory manner to spot trends and relationships in the data. Correlations were calculated between measures of testosterone and cortisol on the one hand and EF and aggression variables on the other.
Post hoc calculations were conducted using G*Power in order to compute the achieved statistical power.

Descriptives
The descriptives of the two study groups are provided in Table 1. The offender and non-offender group did not significantly differ in their median educational score, although there was a trend suggesting a different distribution in educational scores, p = .11. There were statistically significant differences between these groups in mean age, p = .02, and mean IQ-estimates on the Raven Progressive Matrices, p ≤ .001. On average, the participants in the control group were older and had higher IQ-scores than prisoners.

Comparisons between prisoners and non-prisoners on measures of hot and cool EF
Preliminary analyses revealed that performance on the WCST measures was not correlated with performance on measures of the IGT, as was expected.
Figure 1 displays mean IGT-scores for the offender and non-offender group. The Mann-Whitney test showed that there was no significant difference between prisoners and non-prisoners in the distribution of their scores on the five blocks of the IGT, although there seemed to be a trend in the expected direction during the final 20 card draws in block 5, U = 2.19, z = 1.61, p = .11. The distribution of the total NET-score differed significantly between the two groups in the expected direction, U = 2.28, z = 1.98, p = .048. However, the effect size was small, r = .16 for the IGT total NET-score.
For the WCST there was a clear significant difference between prisoners and non-prisoners in the distribution of scores for the number of perseverative errors, U = 1026.5, z = -3.85, p < .001, non-perseverative errors, U = 1176.5, z = -3.16, p = .002, and completed categories, U = 2447, z = 3.32, p = .001. Effect sizes were all in the medium range: r = -.31 for perseverative errors, r = -.28 for non-perseverative errors, and r = .27 for completed categories. Mean WCST-scores for both groups are displayed in figure 2.

Note:
1 Educational level was based on the classification system of Verhage [58] in Dutch education with 6 levels of education: (1) not graduated from primary school, (2) only graduated from primary school, (3) vocational education, (4) secondary vocational education, (5) higher vocational education, (6) academic education.
2 IQ-scores were estimated using the Raven Standard Progressive Matrices, which provides an IQ estimate with a minimum set at 70.
3 Total number of convictions, violent convictions and non-violent convictions refer to the total number of convictions the participants had during their lifetime altogether and, more specifically, for violent crimes and non-violent crimes. Violent crimes included assault, (attempted) manslaughter, (attempted) murder, armed/violent robbery, arson, sex crimes and possession of weapons. Non-violent crimes were, for example, fraud, theft and drug crimes.

In order to determine if the aforementioned findings could be attributed to confounding pre-existing group differences in intelligence, a matched offender group with IQ-scores similar to those of the non-offender control group was created by means of propensity score matching (n = 30). When the Mann-Whitney tests were repeated for the matched group and the control group, no significant group differences remained for any of the IGT or WCST variables. However, the reduction in group sizes resulted in a loss of statistical power, such that the effect sizes found earlier were no longer detectable.
It was not possible to match the groups on age in addition to intelligence and still retain sufficient statistical power, and thus it could not be investigated if the group differences in cognitive performance were a mere reflection of an age effect. Correlations between cognitive variables and age in the offender group were significant for the number of perseverative errors, ρ = .20, p = .04, and completed categories on the WCST, ρ = -.31, p = .001, but not for non-perseverative errors on this task, ρ = .11, p = .25. This means that performance on the WCST declined with age. Since the non-prisoners were older than the prisoners and also performed better on the WCST, this suggests that correcting for age differences would potentially have led to larger group differences in performance on the WCST. Therefore, it is unlikely that the observed group differences in cool EF could be attributed to pre-existing group differences in age. Correlations between IGT measures and age were all close to zero, suggesting that an age effect for hot EF is unlikely. Additionally, even though age differences between the groups were statistically significant, the actual mean difference of 5.4 years can still be regarded as relatively small when it comes to its effect on cognition.

The relationship between measures of aggression and measures of hot and cool EF
Correlational analyses showed that the aspects of aggression (total number of lifetime convictions for violent crimes, average score on observed aggressive behaviour, and the subscales and total scores on the three self-report questionnaires) were not significantly correlated with outcomes on the WCST or IGT, with one exception: a greater number of lifetime convictions for violent crimes was weakly correlated with fewer achieved categories on the WCST, ρ = -.19, p = .046.
The exploratory factor analysis revealed that all self-report scales, including the impulsive/reactive and instrumental/proactive aggression subscales, loaded together on one factor with an eigenvalue of 6.46, explaining 43.1% of the variance. Within the RPQ the reactive aggression scale was highly correlated with the proactive aggression scale, r = .82, p ≤ .001. This was to a lesser extent also the case for the impulsive and instrumental aggression scales of the IPAS-30, r = .36, p ≤ .001. In addition, there was only a weak correlation between the impulsive aggression scale of the IPAS-30 and the reactive aggression scale of the RPQ, r = .23, p = .005, and the instrumental/proactive scales of those instruments correlated only moderately, r = .56, p ≤ .001. In other words, the subscales of the self-report aggression questionnaires (IPAS-30, RPQ and AQ) were intercorrelated in such a manner that no separate aggression factors could be distinguished. With respect to the impulsive/reactive versus instrumental/proactive aggression distinction, there was too much overlap between scores of supposedly different aggression subtypes, and too little overlap between scales that were supposed to assess similar aspects of aggression. Therefore, no valid, distinguishable measure for impulsive versus instrumental aggression could be extracted from the data. As a result, one mean aggression score was calculated to represent the self-report questionnaires, based on standardized values of the total scores of the IPAS-30, RPQ and AQ.
The exploratory factor analysis also revealed that the observational measures of aggression and the criminal records of convictions for violent crimes did not load on the same factor as the self-report questionnaires. Observational data from staff members on the SDAS were significantly, though weakly, correlated to the number of convictions for violent crimes, r = .21, p ≤ .08. Therefore, three regression models were tested, each with a different measure for aggression as dependent variable (the mean self-report measure, the mean score of the four SDAS ratings, and lifetime number of convictions for violent crimes). Due to missing data the regression analyses were based on 116 participants for the mean self-reported aggression, 99 for the SDAS, and 114 for the violent crimes. There were no violations of the assumption of linearity. Three participants were excluded from the regression analyses due to outliers on measures of aggression. The regression was bootstrapped because multiple variables were not normally distributed. The residuals were independent, there were no problems with multicollinearity, nor were there signs of moderator effects for intelligence or age. Results of those regression analyses are displayed in table 2.
There were no meaningful contributions from WCST and IGT variables to the prediction of outcome on the aggression variables. Age and intelligence contributed significantly to the prediction of the mean self-reported aggression, although this contribution was small (R² for those predictors combined was .10). Conducting the regression analyses without correction for intelligence or age led to similar outcomes for the contributions of the WCST and IGT variables. The statistical power in the regression analyses with eight predictors in the model, observed R² values of .13 and a sample size of 122 was 60%, suggesting that the sample was too small to be able to detect a true effect of this small size.

The potential role of prenatal testosterone exposure and the testosterone/cortisol ratio
Because only a small number of participants agreed to provide saliva samples (n = 38), it was decided not to include testosterone/cortisol measures in the regression model as a predictor. Little's MCAR test revealed that the missing data were spread randomly across the offender sample, χ²(15) = 8.18, p = .92. No imputations were made.
Testosterone levels were remarkably high for many of the participants after a first analysis. For that reason, saliva samples were diluted with sample diluent (to 1:3) and re-analysed. For each participant, the latter results were then averaged across the two saliva samples, and these mean scores were used for statistical analysis: mean testosterone = 432.20 pg/mL, sd = 218.71, mean cortisol = 4.35 nmol/L, sd = 4.50, mean testosterone/cortisol = 160.48, sd = 114.82. To provide a frame of reference, since no saliva samples were collected in the non-offender group, it is worth mentioning that in an earlier study with the same saliva sampling method in a sample of 722 Dutch men with anxiety and depression problems (mean age = 44.9), the mean testosterone level was 25.7 pg/mL (95%CI = 24.5-27.1) in the morning and 19.4 pg/mL (95%CI = 18.4-20.5) in the evening [50]. No statistical comparisons were made between testosterone levels in the present study and the latter values, however.
Table 3 shows the correlations between the testosterone/cortisol variables and the main variables for hot and cool EF and aggression. Having high endogenous testosterone levels relative to cortisol levels was significantly correlated with worse performance on two IGT measures (representing 'hot EF') and with higher levels of self-reported aggression (based on the generated common factor). However, correlational analyses between levels of testosterone relative to cortisol and the individual subscale scores of the self-report aggression questionnaires showed no significant correlations. Furthermore, no significant relation was found for WCST variables (representing 'cool EF'), observed aggressive behaviour or number of lifetime convictions for violent crimes. Finally, no significant correlations were found between prenatal testosterone exposure and cognitive or aggression measures.

Table 3: Parametric or non-parametric correlations between measures of endogenous testosterone/cortisol ratios and prenatal testosterone exposure (2D:4D ratio of both hands) versus measures of hot and cool EF and aggression in prisoners.

Discussion
The purpose of the present study was to investigate 1) whether male prisoners perform worse than non-prisoners on measures of cool and hot EF, 2) if outcome on measures of hot and cool EF can be statistically predicted by measures of aggression, and 3) whether this relation between aggression and EF may be influenced by testosterone/cortisol levels. The outcomes for these questions and their implications are discussed in this order in this section.
A priori it was hypothesised that prisoners would perform worse than non-prisoners on measures of both hot and cool EF. This could only be partly confirmed by the data. Our study showed that male offenders performed more poorly than non-offenders on hot EF tasks, but the effect size was small (0.16). They also performed less well on cool EF tasks (effect sizes = 0.27-0.31), but these differences disappeared when the data were corrected for IQ. Furthermore, individual differences in test performance were large on both tasks, resulting in large variance in the data. This suggests that even though there were differences at group level, poor performance on hot or cool EF is certainly not a characteristic shared by all prisoners. To better understand individual differences in EF performance within the offender group, we investigated whether EF performance was connected to characteristics of aggression.
Consequently, the second hypothesis in the present study was that within the offender group there would be a stronger relationship between measures of hot EF and (reactive) aggression than between cool EF and (reactive) aggression. This could not be corroborated: correlational analyses between measures of all aspects of aggression and outcomes on the EF tests were all non-significant except one. Unfortunately, a factor analysis revealed that no valid measure of reactive versus proactive aggression could be extracted from the data, because these factors were too strongly intercorrelated. Therefore, only aggression as a general concept could be investigated, using three separate aggression variables in the regression models (self-report aggression questionnaires, observational data on current aggressive behaviour, and the history of committed violent crimes). Contrary to expectation, the outcomes of the regression analyses revealed that measures of both hot and cool EF did not significantly explain the variance in any of those three aggression variables beyond the (small) variance that was already explained by intelligence and age. This finding appears to stand in contrast to the conclusion drawn in an earlier published systematic review [15]. That review examined 16 empirical studies on the relationship between risky decision making and aggression. Although this was not consistent among all studies, overall evidence was found across different forensic and non-forensic populations for a significant positive relationship between increased risk taking during decision making and higher levels of aggression, especially reactive aggression [15]. How, then, can we understand the present negative findings with respect to aggression? A potential explanation is that problems with hot and cool EF are characteristic of antisocial traits or behaviour in general, rather than specifically related to aspects of aggression. The relationship found between (reactive) aggression and risky decision making would then be a consequence of the fact that (reactive) aggression is characteristic of general underlying problems, such as antisocial behaviour and/or personality traits, poor inhibitory control and impulsivity. This explanation is in line with one other finding in the aforementioned systematic review [15]: in the reviewed studies all participants from forensic groups appeared to make more risky decisions compared to non-prisoners, indicating that antisocial aspects other than aggression alone relate to increased risky decision making [59][60][61][62][63][64]. In line with this, the two large meta-analyses mentioned earlier on the relationship between executive dysfunction and antisocial traits also showed that all types of antisocial aspects were related to (mainly cool) executive dysfunctions, not just to aspects of aggression [11,12]. However, when looking at this from a neuroanatomical perspective, questions remain: both antisocial behaviour in general and (reactive) aggression in particular have been linked to orbitofrontal networks [7,59], so why would the former be related to cognitive dysfunction on the IGT but not the latter? In other words, this line of reasoning could explain why prisoners performed worse than non-prisoners on the EF tasks in the present study, but not exactly why we failed to find a relationship between measures of aggression and EF within the offender sample.
When looking into this on a more conceptual neurocognitive level, one can argue that the present task selection makes it too complicated to investigate specific neurocognitive functions. Not only is aggression a broad, complex concept, influenced by many (neuro)cognitive, emotional, physical and environmental factors [60], but the tasks applied in the present study also rely on multiple complex neurocognitive processes [2,46,61]. To be able to assess fundamental problems in specific neurocognitive processes, it would be advisable to use more targeted laboratory tasks in the future, which are specifically designed to measure single cognitive functions. For risky decision making, a good example of such a task could be the Balloon Analogue Risk Task (BART) [62], a computerised task in which participants pump up a balloon on a screen by pressing a key. Each pump is rewarded with money, but the larger the balloon grows, the greater the risk becomes that it pops, resulting in a loss of money. The number of pumps is indicative of risk-taking behaviour. Compared to the IGT this task is less complicated: it is not ambiguous and provides only one response option. Although both tasks are supposed to measure similar processes, they appear to assess different aspects of decision making [63], probably due to a different learning process during the task [64].
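The task logic described above can be illustrated with a minimal simulation sketch. This is not the implementation used in [62] or in any study cited here: the function names, the maximum of 128 pumps, the reward of 0.05 per pump, and the uniform explosion rule are all assumptions chosen for illustration only.

```python
import random


def run_bart_trial(planned_pumps, max_pumps=128, reward_per_pump=0.05, rng=None):
    """Simulate one balloon; return (pumps_made, earnings, popped)."""
    if rng is None:
        rng = random.Random()
    # Assumed explosion rule: the pop point is uniform on 1..max_pumps,
    # so the chance of popping on the next pump grows with balloon size.
    explosion_point = rng.randint(1, max_pumps)
    if planned_pumps >= explosion_point:
        # The balloon pops and the money collected on it is lost.
        return explosion_point, 0.0, True
    # The participant stops in time and keeps the money.
    return planned_pumps, planned_pumps * reward_per_pump, False


def mean_pumps(planned_pumps, n_balloons=30, seed=0):
    """Average number of pumps over a series of balloons (risk-taking index)."""
    rng = random.Random(seed)
    trials = [run_bart_trial(planned_pumps, rng=rng) for _ in range(n_balloons)]
    return sum(pumps for pumps, _, _ in trials) / n_balloons


if __name__ == "__main__":
    # Compare a cautious, a moderate and a risky (hypothetical) participant.
    for plan in (10, 40, 80):
        print(plan, round(mean_pumps(plan), 1))
```

In this sketch there is only one response option (pump or stop) and the risk structure is fully determined by balloon size, which mirrors the point made above that the task is less ambiguous than the IGT.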
Related to this issue is the question of what role general intelligence played in the results on the WCST and IGT in the present study. Besides having more difficulty on cool and hot EF tasks than non-prisoners, prisoners also had lower mean IQ scores than non-prisoners based on the Raven Standard Progressive Matrices. These group differences in EF performance disappeared when correcting for those intellectual differences. This is not a new finding. In a large meta-analysis on the relationship between antisocial traits and executive dysfunction, larger differences in EF between antisocial and control groups appeared to be related to larger group differences in IQ [11]. Although this finding may be meaningful, it should also be interpreted with some caution. First of all, as a result of the matching process in the present study the group sizes were reduced considerably, resulting in a large decline in statistical power. It could therefore be that a true (small) effect was still present but could no longer be detected as a result of reduced power, leading to a type II error. Second, it is important to consider how the constructs of EF and intelligence relate to each other: can they be seen as separate constructs? Ardila [65] has recently proposed that executive functions should be regarded as containing two domains. One domain he calls 'metacognitive executive functions', which include, for example, working memory, problem solving, planning, abstract reasoning, and strategy development and implementation. The other domain he calls 'emotional/motivational functions', which are responsible for coordinating cognition and emotion. In essence, these two concepts represent 'cool' and 'hot' EF, respectively. Furthermore, Ardila [65] suggests that intelligence is related to the metacognitive ('cool') EF, but not to emotional/motivational ('hot') EF. Drawing a parallel to the present study, this implies that intellectual differences between the study groups could be at least partially related to the difference found in WCST performance, but not to that found in IGT performance. In line with this, previous studies on the relationship between performance on the WCST and general intelligence have been inconclusive [66,67], while there appears to be no meaningful relationship between performance on the IGT and general intelligence [46].
The third and final hypothesis in this study was that the relationship between reactive (not proactive) aggression and poor performance on hot EF measures would be influenced by levels of testosterone/cortisol, but that this influence would be absent in the relationship between cool EF and reactive aggression. Because only a relatively small number of saliva samples could be collected, this relationship could not be investigated in the regression model. However, we were able to assess in a more exploratory manner whether there were any relationships between the testosterone/cortisol ratio and measures of hot and cool EF on the one hand, and aggression on the other hand. Since these findings were based on a relatively small number of saliva samples, they should be regarded as preliminary. In line with expectations, a ratio of high endogenous testosterone levels relative to cortisol levels was significantly correlated with worse performance on the IGT (representing hot EF) and with higher levels of self-reported aggression. Interestingly, when looking at the single scale scores of the self-report questionnaires separately, none of these correlated significantly with the mean testosterone/cortisol ratio, so it is hard to determine whether or not this is an artefact and, if not, what exactly this relationship may characterise. No such relation was found for WCST variables (representing cool EF), observed aggressive behaviour or total number of convictions for violent crimes. The significant correlations found between endogenous testosterone/cortisol levels and both IGT performance and self-reported aggression were weak to moderate. This suggests a meaningful relationship (even though the correlation in itself does not prove causality). Some critical remarks are in order here, however. First of all, a very recent meta-analysis and review both provide little support for the dual hormone hypothesis in relation to status-driven behaviour, including aggression and risk taking [68,69], which underlines the importance of interpreting the present findings, which were based on a small sample, with great reserve. Furthermore, it should be noted that investigating testosterone and cortisol interactions is not statistically equivalent to calculating the testosterone/cortisol ratio, since the latter suggests a linear relation, while the dual hormone hypothesis reflects a model in which testosterone only negatively influences behaviour when cortisol is low [68]. Finally, the present study did not assess how participants rated their current stress levels. Since a prison setting can be stress-inducing, just as participation in a scientific study can be, and increased stress levels are accompanied by higher cortisol levels [70,71], it could be that our results were confounded by elevated cortisol levels.
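The statistical difference between a ratio and an interaction can be made explicit with two schematic regression models. The notation is illustrative and not taken from the present analyses: y stands for a behavioural outcome (for example an aggression or IGT score), T for testosterone, C for cortisol, and the coefficients are hypothetical.

```latex
\begin{align}
  % Ratio model: one linear slope for the testosterone/cortisol ratio
  y &= \beta_0 + \beta_1 \frac{T}{C} + \varepsilon \\
  % Interaction model: the slope of testosterone depends on cortisol
  y &= \gamma_0 + \gamma_1 T + \gamma_2 C + \gamma_3 (T \times C) + \varepsilon,
  \qquad \frac{\partial y}{\partial T} = \gamma_1 + \gamma_3 C
\end{align}
```

In the interaction model the effect of testosterone on behaviour, \gamma_1 + \gamma_3 C, can be present at low cortisol and shrink towards zero as cortisol rises (when \gamma_3 has the opposite sign to \gamma_1), which is the pattern the dual hormone hypothesis describes. The ratio model cannot express this conditional pattern with its single coefficient \beta_1.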
In contrast to the findings related to endogenous testosterone/cortisol levels, no significant correlations were found between EF or aggression measures and prenatal testosterone exposure, which was determined by measuring the ratio between the index and ring finger lengths, the so-called 2D:4D ratio [52]. Results from a recent meta-analysis are in line with this finding [72]. Even though this method had been used before in similar research [73], there appears to be only a small effect size in the relation between prenatal testosterone exposure and aggression and risk taking later in life [72,74]. Therefore, future studies would do better to focus on endogenous rather than prenatal testosterone levels.
Bearing in mind that findings from the present small sample should be interpreted with caution, one interesting additional finding in the present study is that the mean testosterone levels in the saliva of the prisoners, independent of cortisol values, were remarkably high (i.e. 16.8 times higher) when compared with the mean morning testosterone level in a large group of males in a prior study [50]. The values in the latter group were based on the same saliva sampling method as in the present study, and therefore this difference cannot be explained by methodological differences in instruments. The present findings are not completely in line with those of an earlier study, which revealed that testosterone levels in prisoners were in the normal range, although testosterone levels were significantly higher in prisoners convicted of violent crimes than in those convicted of non-violent crimes [75]. Unfortunately, no saliva samples were obtained from the present control group with which to make an accurate comparison. This is something future studies should consider doing.
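The size of this difference can be checked with simple arithmetic on the reported means. The sketch below uses only the values given in the Results (432.20 pg/mL in the present sample; 25.7 pg/mL in the morning and 19.4 in the evening in the reference sample [50]); the variable and function names are illustrative, and the calculation is a descriptive comparison, not a statistical test.

```python
# Mean salivary testosterone reported in the Results (pg/mL).
prisoner_mean_t = 432.20
# Reference means from the earlier study [50].
reference_morning_t = 25.7
reference_evening_t = 19.4


def fold_difference(value, reference):
    """Return how many times larger `value` is than `reference`."""
    return value / reference


morning_fold = fold_difference(prisoner_mean_t, reference_morning_t)
evening_fold = fold_difference(prisoner_mean_t, reference_evening_t)

print(f"{morning_fold:.1f}")  # 16.8
print(f"{evening_fold:.1f}")  # 22.3
```

The factor of 16.8 therefore corresponds to the comparison with the morning reference value; relative to the evening value the difference is larger still.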
Based on the foregoing, some further recommendations can be made for future research on aggression, EF and biomarkers. First of all, the focus on the testosterone/cortisol ratio seems promising for both decision making and aggression. Unfortunately, the present sample was too small to draw more definite conclusions. Future studies with larger samples (based on accurate power calculations) should establish whether this finding can be replicated and whether it bears clinical relevance. Selecting more specific laboratory tasks to assess separate cognitive functions related to EF may lead to more insight into the exact cognitive processes involved. Also, a more valid assessment of reactive aggression or disinhibition could provide more insight into specific characteristics related to aggression and more general antisocial characteristics.
Ultimately, the relevance of all of this scientific knowledge depends on the degree to which it can be transferred to clinical practice. Even though there are relatively stable individual differences in both hot and cool EF across the lifespan, there is also growing evidence that both types of EF are malleable and can be improved in non-forensic populations through training, independent of age [1,76]. This concerns, for example, working memory training [76][77][78] or other process-based EF training procedures that target general capacities such as inhibition or mental flexibility [76], multi-domain training (e.g. video-game training) [79][80][81] and strategy-based training [82], as well as indirect approaches such as intense physical exercise [83] and music training [84]. It must be said, however, that positive training effects do not always persist over time, that generalizability to other cognitive domains is limited, and that transfer to daily life remains unclear [76,78]. Whether EF can be improved through such procedures in offender populations remains to be investigated. Especially since a recent study found a relationship between performance on the WCST and future recidivism [85], it appears to be important to target EF in interventions in this group, for example by combining EF training with traditional cognitive behavioural treatment procedures. Thus, EF training may ultimately prove to be an extra tool in recidivism reduction.

Conclusion
The present study confirmed that prisoners show significant problems in cool EF (planning, strategic/logical reasoning, evaluating) compared to non-prisoners. The results also showed that prisoners tend to show more problems in hot EF (they take more risk in decision making and learn less from errors) than non-prisoners. These problems are not clearly related to characteristics of aggression in this sample of prisoners. An interesting preliminary finding in line with our expectations is that, in spite of the lack of a direct relationship between hot EF and aggression, both of these factors were positively correlated with having a combination of high endogenous testosterone and low cortisol levels. This is one of the first studies to have assessed all these factors together in one forensic sample.

Figure 1:
The learning curve on the IGT, representing mean IGT scores on the five consecutive blocks of card draws, for prisoners (n = 123) and controls (n = 30), as well as the mean Net Total IGT score. Error bars display standard deviations. Higher scores represent more safe choices.

Table 1:
Descriptive data of the study population.

Table 2:
Linear model predictors of mean self-reported aggression, mean staff-observed aggression (SDAS), and lifetime number of convictions for violent crimes (violent crimes) within the offender sample (N = 122¹). The R² values for the predictors collectively in each model were .13, .07 and .13, respectively.