Examples of the disclosure efficiently process data sets. In some examples, a plurality of first processor elements process a first data set (e.g., an image) and a second data set (e.g., a kernel) using a first function to generate a third data set. The third data set is processed using a second function to generate an output element. The first processor elements are arranged in a two-dimensional systolic array such that one or more first processor elements receive input from a first adjacent first processor element and transmit output to a second adjacent first processor element. A plurality of second processor elements aggregate the output element to at least partially generate a fourth data set. The plurality of second processor elements are arranged in a one-dimensional array. Aspects of the disclosure facilitate increasing speed, conserving memory, reducing processor load or an amount of energy consumed, and/or reducing network bandwidth usage.
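
The data flow described above can be illustrated with a minimal software sketch. It is not the disclosed hardware: it assumes the first function is multiplication, the second function is summation, and the fourth data set is an output feature map, none of which the text above fixes. The names `first_elements`, `second_elements`, and `correlate_valid` are hypothetical, and the loops only mimic neighbor-to-neighbor passing sequentially.

```python
def first_elements(patch, kernel):
    """Model the two-dimensional array of first processor elements.

    Assumption: the first function is multiply and the second is add.
    """
    # Third data set: one product per (pixel, weight) pair.
    products = [
        [pixel * weight for pixel, weight in zip(patch_row, kernel_row)]
        for patch_row, kernel_row in zip(patch, kernel)
    ]
    # Each element adds its product to the partial sum received from one
    # neighbor and hands the result to the next neighbor.
    partial = 0
    for product_row in products:
        for product in product_row:
            partial += product
    return partial  # the output element


def second_elements(output_elements):
    """Model the one-dimensional array of second processor elements.

    Assumption: each element accumulates into one slot of an output row.
    """
    accumulators = [0] * len(output_elements)
    for index, element in enumerate(output_elements):
        accumulators[index] += element
    return accumulators


def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding, no kernel flip)."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    result = []  # fourth data set, built one row at a time
    for top in range(out_h):
        row_elements = []
        for left in range(out_w):
            patch = [image[top + r][left:left + kernel_w] for r in range(kernel_h)]
            row_elements.append(first_elements(patch, kernel))
        result.append(second_elements(row_elements))
    return result


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(correlate_valid(image, kernel))
```

In this sketch the one-dimensional stage only stores one output element per slot; under a different assumption it could, for example, sum output elements across several passes before the fourth data set is complete.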