Easily perform A/B testing from within WordPress itself.
Wait until the experiment has reached at least a 95% confidence level before acting on the results; ideally, wait until it reaches 99%.
The higher the statistical confidence, the less likely it is that the observed difference between variations is due to chance. Experiments on high-traffic sites reach statistical significance much sooner than experiments on low-traffic sites. Making decisions about your experiment results before the proper thresholds have been reached may result in false positives that lead to lower-than-expected conversion rates.
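To make the thresholds above concrete, here is a minimal sketch of one common way to estimate confidence for an A/B test: a two-sided, two-proportion z-test. This is an illustration only and is not necessarily the calculation the plugin performs; the function name `ab_confidence`, its parameters, and the sample numbers are all hypothetical.

```python
import math


def ab_confidence(visitors_a, conversions_a, visitors_b, conversions_b):
    """Estimate confidence (0.0 to 1.0) that variations A and B truly differ.

    Hypothetical helper using a two-sided, two-proportion z-test with a
    pooled standard error. Assumes both visitor counts are greater than zero.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pooled conversion rate under the assumption that A and B are identical.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        # No variation at all (e.g. zero conversions in both arms).
        return 0.0

    z = abs(rate_b - rate_a) / std_err

    # Two-sided p-value is 1 - erf(z / sqrt(2)), so confidence is erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2))


# Same 10% vs 13% conversion rates at three made-up traffic levels.
low_traffic = ab_confidence(100, 10, 100, 13)
medium_traffic = ab_confidence(1000, 100, 1000, 130)
high_traffic = ab_confidence(10000, 1000, 10000, 1300)

for label, value in [("low", low_traffic), ("medium", medium_traffic), ("high", high_traffic)]:
    print(f"{label} traffic: {value:.1%} confidence")
```

With these made-up numbers, the same 10% versus 13% difference stays well below the 95% threshold at 100 visitors per variation, lands between 95% and 99% at 1,000 visitors, and exceeds 99% at 10,000 visitors, which is why lower-traffic sites need to run experiments longer.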