What causes diffusion to occur?

A difference in the concentration of a component means a difference in the Gibbs free energy per molecule (the chemical potential) of that component between the more concentrated region and the more dilute region. That difference is the driving force: molecules in the concentrated region are more likely to migrate toward the dilute region than molecules in the dilute region are to migrate the other way, and the net result is diffusion down the gradient. Note that if the Gibbs free energy in one region is raised by some other means, such as increasing the pressure on one side of a membrane, you can actually get net transport against the concentration gradient. That's how reverse osmosis works: pressure applied to the concentrated solution pushes solvent through the membrane toward the side where the solvent is already more concentrated.
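
To put rough symbols on that argument, here is a sketch under textbook assumptions that the answer itself does not state: an ideal solution, an incompressible solvent, and pure solvent on the far side of the membrane. The symbols (x_i for mole fraction, \bar{V}_w for the solvent's molar volume, c_s for solute molarity, \Pi for osmotic pressure) are introduced here for illustration only.

```latex
% Chemical potential of component i in an ideal solution
\mu_i = \mu_i^\circ + RT \ln x_i

% Driving force between region 1 (concentrated) and region 2 (dilute)
\Delta\mu_i = \mu_i^{(1)} - \mu_i^{(2)} = RT \ln\frac{x_i^{(1)}}{x_i^{(2)}} > 0
\quad\text{when } x_i^{(1)} > x_i^{(2)}

% Adding pressure to the solvent (w) on the solution side
\mu_w(p, x_w) \approx \mu_w^\circ + RT \ln x_w + \bar{V}_w\,(p - p^\circ)

% Solvent flow reverses (reverse osmosis) once the applied pressure
% difference exceeds the osmotic pressure
\Delta p > \Pi = -\frac{RT}{\bar{V}_w}\ln x_w \approx RT\,c_s
```

The last approximation is the van 't Hoff relation, which holds only for dilute solutions.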

Diffusion is the result of random movement of atoms and molecules.

A concentration gradient means that the diffusing molecules have a higher Gibbs free energy in the more concentrated region than in the less concentrated region. That difference provides a driving force that tends to equalize the concentrations and eliminate the gradient.
Another way to look at it: when there is a concentration gradient, there are more molecules of the diffusing material in the concentrated region than in the less concentrated region. As they bang around at random, more of them cross from the concentrated region into the less concentrated region than cross in the opposite direction.
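
The counting argument can be tried out with a toy random walk. This is only a minimal sketch, not a physical model: the function name `simulate`, the one-dimensional lattice, the particle counts, and the reflecting walls are all assumptions made for illustration.

```python
import random


def simulate(n_left=1000, n_right=100, n_sites=20, steps=200, seed=0):
    """Unbiased random walk on a 1-D lattice with reflecting walls.

    Starts with n_left particles in the left half and n_right in the
    right half, then returns the (left, right) counts after `steps` moves.
    """
    rng = random.Random(seed)
    half = n_sites // 2

    # Crowded left half, sparse right half.
    pos = [rng.randrange(0, half) for _ in range(n_left)]
    pos += [rng.randrange(half, n_sites) for _ in range(n_right)]

    for _ in range(steps):
        for i, x in enumerate(pos):
            new = x + rng.choice((-1, 1))  # each particle steps left or right with equal odds
            if 0 <= new < n_sites:         # moves through a wall are rejected
                pos[i] = new

    left = sum(1 for x in pos if x < half)
    return left, len(pos) - left


if __name__ == "__main__":
    print(simulate())
```

No individual particle prefers either direction. The net flow toward the sparse side appears only because more particles sit on the crowded side of the midpoint, so more of them happen to step across it.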
Note that concentration gradients can be sustained, or even created, by imposing other forces on a system that outweigh the statistical tendency of molecules to spread from a region of higher concentration into a region of lower concentration. A good example is an electric or magnetic field imposed on a solution or gas containing charged or magnetically susceptible molecules or particles: the attractive and/or repulsive forces of the field bias the probability of movement in one direction over the other. More subtly, a temperature gradient can alter the Gibbs free energies of different regions, making it more or less favorable for a molecule to remain in one temperature region than in the other. Likewise, the Gibbs free energy can be altered by pressure, which is how reverse osmosis works.
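
The effect of an imposed field can be sketched the same way by biasing each step. Again this is a toy model with assumed, hypothetical names and numbers: `biased_walk` and `p_right` stand in for whatever force tilts the odds, and nothing here is calibrated to a real field strength.

```python
import random


def biased_walk(n_particles=2000, n_sites=20, steps=500, p_right=0.6, seed=0):
    """Random walk with a constant bias toward the right wall.

    Particles start uniformly spread over the lattice. Returns the
    (left, right) half-counts after `steps` moves.
    """
    rng = random.Random(seed)
    pos = [rng.randrange(n_sites) for _ in range(n_particles)]  # no gradient at the start

    for _ in range(steps):
        for i, x in enumerate(pos):
            # p_right > 0.5 plays the role of the imposed field.
            new = x + (1 if rng.random() < p_right else -1)
            if 0 <= new < n_sites:  # reflecting walls
                pos[i] = new

    half = n_sites // 2
    right = sum(1 for x in pos if x >= half)
    return n_particles - right, right


if __name__ == "__main__":
    print(biased_walk())
```

Starting from a uniform distribution, the bias builds a concentration gradient and then holds it in place, with random motion continually working against it. Setting `p_right=0.5` removes the bias and leaves an ordinary unbiased walk.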