The task in this work is to estimate the unknown hyper-parameters of a covariance function (its variance, smoothness parameter and covariance length) from the available measurements. We do this by maximizing the joint log-likelihood function, which is a non-convex, non-linear optimization problem. To avoid the cubic complexity of dense linear algebra, we approximate the discretised covariance function in the hierarchical (ℋ-) matrix format. The ℋ-matrix format has log-linear computational cost and storage, O(kn log n), where n is the dimension of the covariance matrix and the rank k is a small integer. In each iteration of the optimization procedure, the covariance matrix, its determinant and its Cholesky decomposition are recomputed in the ℋ-matrix format. (© 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim)
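To make the objective concrete, the following minimal sketch evaluates a log-likelihood through a Cholesky factor, which yields both the log-determinant and the quadratic form. It rests on several assumptions that the abstract does not state: a zero-mean Gaussian model, so that the log-likelihood is -(n/2) log(2π) - (1/2) log det C - (1/2) zᵀC⁻¹z; a Matérn covariance restricted to the closed-form smoothness values 0.5 and 1.5 (2.5 is also implemented); one-dimensional measurement locations; and a small candidate grid in place of a real optimizer. It uses dense O(n³) arithmetic rather than the ℋ-matrix format, so it illustrates what is recomputed in each iteration, not how the authors compute it. All function names are hypothetical.

```python
import math

def matern(r, variance, length, nu):
    """Matern covariance; the sketch covers only closed-form nu values."""
    s = r / length
    if nu == 0.5:
        return variance * math.exp(-s)
    if nu == 1.5:
        a = math.sqrt(3.0) * s
        return variance * (1.0 + a) * math.exp(-a)
    if nu == 2.5:
        a = math.sqrt(5.0) * s
        return variance * (1.0 + a + a * a / 3.0) * math.exp(-a)
    raise ValueError("sketch supports nu in {0.5, 1.5, 2.5} only")

def covariance_matrix(points, variance, length, nu):
    """Dense covariance matrix for 1D locations (assumption: no H-matrix)."""
    return [[matern(abs(p - q), variance, length, nu) for q in points]
            for p in points]

def cholesky(C):
    """Lower-triangular L with C = L L^T (dense, cubic cost)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = C[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def forward_solve(L, z):
    """Solve L y = z by forward substitution."""
    y = []
    for i in range(len(z)):
        y.append((z[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def log_likelihood(points, z, variance, length, nu):
    """Zero-mean Gaussian log-likelihood of measurements z (assumed model)."""
    n = len(z)
    L = cholesky(covariance_matrix(points, variance, length, nu))
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))  # log det C
    y = forward_solve(L, z)
    quad = sum(v * v for v in y)  # z^T C^{-1} z = ||L^{-1} z||^2
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)

if __name__ == "__main__":
    # Hypothetical data: eight locations and made-up measurements.
    points = [0.1 * i for i in range(8)]
    z = [0.3, 0.5, 0.4, 0.1, -0.2, -0.4, -0.3, 0.0]
    # Candidate grid standing in for the optimizer (assumption).
    candidates = [(v, l, nu) for v in (0.1, 0.5, 1.0)
                  for l in (0.2, 0.5, 1.0) for nu in (0.5, 1.5)]
    best = max(candidates, key=lambda t: log_likelihood(points, z, *t))
    print("best (variance, length, nu):", best)
```

Every candidate evaluation rebuilds the covariance matrix and its Cholesky factor from scratch, which is the step whose cost the ℋ-matrix approximation is meant to reduce.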