When training a neural network, the goal is to make it model a target function h(x). If you add the input x to the output of the network (i.e., you add a skip connection), then the network is forced to model f(x) = h(x) - x rather than h(x). This is called residual learning.
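The identity can be sketched numerically: if the trainable part of a block models the residual f(x) = h(x) - x, then adding the skip connection recovers h(x) exactly. Below is a minimal toy illustration (the target h and the closed-form f are assumptions made up for this example, not learned weights):

```python
import numpy as np

# Toy target function the network is supposed to model (an assumption for
# illustration only).
def h(x):
    return 3.0 * x + 1.0

# With a skip connection, the trainable layers only need to model the
# residual; here we write it in closed form instead of learning it.
def f(x):
    return h(x) - x

def residual_block(x):
    # Skip connection: the block's output is f(x) + x, which equals h(x).
    return f(x) + x

x = np.linspace(-1.0, 1.0, 5)
print(np.allclose(residual_block(x), h(x)))  # the identity holds for all x
```

In a real network, f would be a stack of layers whose weights are learned; the skip connection simply adds the block's input to its output.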