EFFECT OF DESIGN RESOLUTION AND RESPONSE TYPE ON SAMPLE SIZE DETERMINATION UNDER AN ONLINE EXPERIMENTAL FRAMEWORK

Date: 11th Apr 2022

Time: 10:00 AM

Venue: Webex Link

PAST EVENT

Details

Keywords: Online experiments; Design of experiments; Sample size; Optimization; Sensitivity; Simulation; Bayesian approach
Experiments play an important role in understanding and improving a given system by systematically manipulating its input variables. They can be classified by factors such as adaptability, the degree of control over the environment, and the destination of their end product or service. In general, these factors, alongside the research questions pertinent to a particular analysis, help in deciding which type of experiment to choose for that analysis. Adopting the wrong kind of experiment might affect the internal and external validity of the research. The key focus of any type of experiment is to make inferences about the parameters used either to understand or optimize the system, or to compare different treatments and choose the superior one among them. One of the crucial parameters affecting the quality of inferences made from an experiment is the sample size, which influences the reliability of the estimator(s) computed from the experimental data. Even though the reliability of the estimators increases with the sample size, increasing the sample size is not always beneficial, as each sample is associated with some cost. A specific instance of this, relevant to this thesis, arises in the online setting, where the cost of experimenting is conceptualized as the sub-optimal allocation of treatments to units.
In this work, we build on the experimental setting proposed in Sudarsanam and Ravindran (2018). That setting is a two-phased, online setup with a finite horizon for two-level experiments, and it has subsequently been expanded to multi-factor, full factorial designs. This thesis seeks to significantly widen the usability of experiments in such settings through three specific studies. These can be broadly summarized as expansion to fractional designs, exploration of a range of distributional assumptions, and studies of the robustness of solutions. These constitute the three major contributions of the thesis.
The first set of studies broadens the scope of using experiments in such an environment by analyzing fractional factorial designs with various standard aliasing patterns. Unlike in the full factorial case, the experimental noise in the system and the key elements used to generate a fractional factorial design, including the design generator and defining relations, cannot be treated in a general way because of confounding (aliasing). Hence it becomes vital to analyze each generated fractional factorial design individually. The implementations include a Bayesian analysis of all the commonly used fractions of two-level designs for seven or fewer factors. Here, the cumulative improvement is used as the performance measure to optimize the Bayesian framework in an online setting. The results are then validated by hierarchical probability model (HPM) simulation, which also helps capture scenarios that are not examined by the theoretical results. The theoretical results are then compared with the baseline performance, which suggests that the cumulative improvement obtained from the theoretical results is at least 85% greater than the expected cumulative improvement of experimentation with a random number of replicates. Furthermore, the percentage of resources required for the experiment decreases as the total number of available trials, the signal-to-noise ratio, or both increase.
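The roles of the design generator and the defining relation mentioned above can be illustrated with a minimal Python sketch. It assumes a standard textbook half fraction of a four-factor design with generator D = ABC (defining relation I = ABCD); this example is not taken from the thesis, and the function names `fractional_factorial` and `contrast` are hypothetical helpers introduced only for illustration.

```python
from itertools import product


def fractional_factorial(base_factors, generators):
    """Build a two-level fractional factorial design in -1/+1 coding.

    base_factors: names of the factors that receive a full factorial, e.g. "ABC".
    generators:   dict mapping each added factor to the word of base factors
                  whose product defines it, e.g. {"D": "ABC"} for D = ABC.
    """
    runs = []
    for levels in product((-1, 1), repeat=len(base_factors)):
        run = dict(zip(base_factors, levels))
        for new_factor, word in generators.items():
            value = 1
            for factor in word:
                value *= run[factor]
            run[new_factor] = value
        runs.append(run)
    return runs


def contrast(runs, word):
    """Return the column of signs for the effect named by `word`, e.g. "AB"."""
    column = []
    for run in runs:
        value = 1
        for factor in word:
            value *= run[factor]
        column.append(value)
    return column


# Assumed example: half fraction of a 2^4 design, generator D = ABC.
design = fractional_factorial("ABC", {"D": "ABC"})

print(len(design))                                        # 8 runs instead of 16
print(contrast(design, "AB") == contrast(design, "CD"))   # True: AB is aliased with CD
print(set(contrast(design, "ABCD")))                      # {1}: defining relation I = ABCD
```

Because the AB and CD columns are identical, the data cannot separate those two effects. This is one way to see why each fraction, with its own aliasing pattern, needs to be analyzed individually.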
The second set of studies explores three discrete distributions (binomial, hypergeometric, and Poisson) as potential distributions of the response variable, in addition to the Gaussian priors of the base model. The binomial and hypergeometric distributions model the number of successes as a random variable under two different sampling strategies, which makes it possible to examine the impact of choosing one strategy over the other. The comparison shows that the performance measure of the hypergeometric case can act as a tight upper bound for the binomial case when the population size is small. Along the same lines, the Poisson distribution models count data, for which the theoretical result is a function of the finite horizon and the rate parameter of the exponential distribution. Overall, the insights related to the parameters of the discrete distributions align with those of the base model, in which the sample size increases with the size of the horizon and decreases with the signal-to-noise ratio.
For both continuous and discrete cases, the theoretical results show that the optimal sample size depends on three critical parameters: the signal-to-noise ratio, the size of the finite set, and the number of factors considered. Hence, it is essential to understand the effect of varying these parameters on the optimal sample size, which is done through a sensitivity analysis. In the third set of studies, we analyze the sensitivity of these parameters through mathematical and graphical methods, as the relation between the parameters and the optimal sample size is deterministic. This analysis provides a ranking of the parameters by importance, the sensitive regions in the parameter space, and the critical points of the parameters. These insights can inform recommendations for practitioners on choosing the parameter values for their setting.
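The difference between the two sampling strategies behind the binomial and hypergeometric responses can be sketched with their textbook variance formulas. This illustration covers only the sampling distributions themselves, not the performance measure or the bound derived in the thesis; the function names and the numeric values below are assumptions chosen for illustration.

```python
def binomial_variance(n, p):
    """Variance of the number of successes in n draws with replacement."""
    return n * p * (1 - p)


def hypergeometric_variance(n, population, successes):
    """Variance of the number of successes in n draws without replacement
    from a finite population containing `successes` success units."""
    p = successes / population
    # Finite population correction: below 1 whenever n > 1.
    fpc = (population - n) / (population - 1)
    return n * p * (1 - p) * fpc


n = 10  # assumed number of draws
for population in (20, 200, 2000):
    successes = population // 4  # keeps the success proportion at 0.25
    print(
        population,
        binomial_variance(n, successes / population),
        hypergeometric_variance(n, population, successes),
    )
```

With a small population, sampling without replacement gives a noticeably smaller variance than sampling with replacement. As the population grows, the finite population correction approaches 1 and the two cases nearly coincide, which is why the choice of strategy matters most when the population is small.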

Speakers

Mr. P. BALAJI Roll No. MS15D004

DEPARTMENT OF MANAGEMENT STUDIES