The algorithm is pretty well calibrated at predicting gender from personality, though it seems a little overconfident at the high end! Here is a chart I made: the red line shows how confident the algorithm was (x-axis) against how often it was actually right (y-axis). The blue line is perfect calibration, where confidence and accuracy are equal.
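
For anyone who wants to build a chart like this, here is a rough sketch of how the numbers behind a calibration curve could be computed. This is not the code I used for the chart: the function name `calibration_bins`, the five equal-width bins, and the synthetic data (made deliberately overconfident above 90%) are all assumptions for illustration. It also assumes "confidence" means the probability assigned to the predicted class, so it runs from 0.5 to 1.0 for a two-way guess.

```python
import random


def calibration_bins(confidences, correct, n_bins=5):
    """Bin predictions by confidence (0.5 to 1.0, equal-width bins) and
    return (mean confidence, accuracy, count) for each non-empty bin."""
    lo, hi = 0.5, 1.0
    width = (hi - lo) / n_bins
    # Per bin: [sum of confidences, number correct, number of predictions]
    totals = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp the index so values at or beyond the edges land in an end bin.
        i = max(0, min(int((conf - lo) / width), n_bins - 1))
        totals[i][0] += conf
        totals[i][1] += int(ok)
        totals[i][2] += 1
    return [(t[0] / t[2], t[1] / t[2], t[2]) for t in totals if t[2] > 0]


# Synthetic stand-in data (hypothetical, not the real predictions):
# well calibrated up to 0.9, then accuracy stops improving.
rng = random.Random(0)
confidences = [rng.uniform(0.5, 1.0) for _ in range(20000)]
correct = [rng.random() < min(c, 0.9) for c in confidences]

bins = calibration_bins(confidences, correct)
for mean_conf, accuracy, n in bins:
    # Perfect calibration would have accuracy equal to mean_conf in every bin.
    print(f"confidence {mean_conf:.2f}  accuracy {accuracy:.2f}  n={n}")
```

Plotting mean confidence against accuracy for each bin gives the red line, and the diagonal where the two are equal gives the blue one. Points below the diagonal mean the algorithm was more confident than it had any right to be.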