Most database textbooks expose the reader to basic relational algebra and calculi. These languages offer a foundation for understanding query processing and optimization, and for understanding SQL. However, Codd's original relational algebra and calculus, as well as those presented in many textbooks, fall short in addressing aggregate computation, which, with the rise of OLAP, Data Warehousing, and Business Intelligence, has recently gained increased prominence.
As early as 1980, Anthony Klug realized the importance of aggregate functions. (Had this paper been written in the mid-to-late 1990s, it would have been a data warehousing paper!) At that time, the uses of aggregate functions in relational query languages were not well understood. For example, System R and Ingres used "sets" of tuples having duplicates when defining aggregates in their languages. This paper represents an early and very substantial step forward in aggregate computation.
In a few precisely written and easily read pages, the paper defines aggregate functions and then elegantly introduces these into the relational algebra using an aggregate formation operator. This is the part of the paper I like the most and the one that I recommend as still being a great introduction to relational algebra with aggregate functions. Following this part, the paper defines the corresponding calculus. Finally, the two languages are proven equivalent.
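To give a flavor of the aggregate formation operator, here is a minimal Python sketch. It is my own illustration rather than the paper's notation: the names `aggregate_formation` and `sum_col`, the toy `emp` relation, and the positional encoding of attributes are all assumptions. The idea it tries to capture is that an aggregate function is applied to a set of whole tuples, one set per combination of grouping values, so that equal values in the aggregated column are not lost.

```python
from typing import Callable, FrozenSet, Sequence, Tuple

Row = Tuple
Relation = FrozenSet[Row]


def aggregate_formation(
    r: Relation, group_cols: Sequence[int], agg: Callable[[Relation], object]
) -> Relation:
    """Hypothetical sketch: for each tuple t in r, emit t restricted to
    group_cols, extended with agg applied to all tuples of r that agree
    with t on group_cols. The result is again a set of tuples."""
    result = set()
    for t in r:
        key = tuple(t[i] for i in group_cols)
        # The group is a set of whole tuples, so two employees with the
        # same salary remain two distinct members.
        group = frozenset(
            u for u in r if tuple(u[i] for i in group_cols) == key
        )
        result.add(key + (agg(group),))
    return frozenset(result)


def sum_col(i: int) -> Callable[[Relation], object]:
    """Aggregate function that sums column i over a set of tuples."""
    return lambda rows: sum(row[i] for row in rows)


# Toy relation (name, dept, salary); purely illustrative data.
emp = frozenset({("ann", "db", 100), ("bob", "db", 100), ("cy", "os", 90)})

# Total salary per department via aggregate formation.
totals = aggregate_formation(emp, [1], sum_col(2))

# Contrast: projecting salaries into a set first collapses the two 100s.
naive_db_total = sum({row[2] for row in emp if row[1] == "db"})
```

Here `totals` contains `("db", 200)` and `("os", 90)`, whereas `naive_db_total` is 100: summing over a set of projected values silently drops the duplicate salary, which is the kind of difficulty that led earlier languages to resort to "sets" with duplicates.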