## Chapter Objectives

The objective of this chapter is to estimate an optimal $$\Psi$$-specific treatment regime in the setting where there are $$K \gt 1$$ decision points at which treatment selection takes place. In the nomenclature of potential outcomes, the value of a regime $$d$$ is defined as

$\mathcal{V}(d) = E\left\{Y^{\text{*}}(d)\right\},$

where $$Y^{\text{*}}(d)$$ is the potential outcome that an individual would achieve if all $$K$$ rules in $$d$$ were followed to select treatment. A regime that satisfies

$E\left\{Y^{\text{*}}(d^{opt})\right\} \ge E\left\{Y^{\text{*}}(d)\right\} \quad \textrm{for all } d \in \mathcal{D}$

is termed an optimal regime.
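For concreteness, a regime with $$K$$ decision points can be sketched as a collection of decision rules; here $$h_k$$ denotes the history accumulated through Decision $$k$$ and $$\Psi_k(h_k)$$ the corresponding set of feasible treatment options, notation assumed to be consistent with Chapter 7:

$d = \{d_1, \ldots, d_K\}, \qquad d_k : h_k \mapsto a_k \in \Psi_k(h_k), \quad k = 1, \ldots, K,$

so that $$Y^{\text{*}}(d)$$ is the outcome that would be achieved were the treatment at each of the $$K$$ decisions selected by the corresponding rule applied to the history available at that point.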

Here, we provide example analyses using the Q-learning, classification, value search, and backward outcome weighted learning approaches discussed in Chapter 7. These estimators of $$d^{opt}$$ are implemented in the R package DynTxRegime, which is freely available from the Comprehensive R Archive Network (CRAN).
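As a preliminary step, the package must be installed and loaded in an R session. A minimal sketch follows; the CRAN mirror URL shown is illustrative, and any mirror may be used.

```r
# Install DynTxRegime from CRAN (a one-time step); the mirror is illustrative
install.packages("DynTxRegime", repos = "https://cloud.r-project.org")

# Load the package, making its estimators available in the current session
library(DynTxRegime)
```

Once loaded, `help(package = "DynTxRegime")` lists the functions the package exports, along with their documentation pages.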