The pervasive reliance on data and experimental evidence in system modeling, decision-making, pattern recognition, and control engineering, to enumerate just several representative spheres of interest, entails the centrality of data and emphasizes their paramount role in data science. To capture the essence of data, facilitate building their essential descriptors, and reveal key relationships, while realizing all these faculties in an efficient manner and delivering transparent, comprehensive, and user-oriented results, we advocate a genuine need for transforming data into information granules. In this setting, information granules are regarded as conceptually sound knowledge tidbits over which various models can be developed and utilized.
A tendency witnessed more and more visibly nowadays concerns human centricity. Data science and big data revolve around an efficient two-way interaction with users. Users interact with data analytics processes, meaning that terms such as data quality, actionability, and transparency are of relevance and are specified in advance. In this regard, information granules emerge as a sound conceptual and algorithmic vehicle owing to their ability to deliver a more general view of data, ignoring irrelevant details and supporting a suitable level of abstraction aligned with the nature of the problem at hand.
Our objective is to provide a general overview of Granular Computing, identify the main items on its agenda, and associate their usage with the setting of data analytics. To organize the discussion in a coherent way, highlight the main threads, and deliver a self-contained material, the study is structured in a top-down manner. Some introductory material offering motivating insights into the existing formalisms is covered in Section Ⅱ. Section Ⅲ is devoted to the design of information granules with a delineation of the main directions. The principle of justifiable granularity is presented in depth, including both its generic version and a number of essential augmentations. The shift of paradigm implied by the involvement of information granules is covered in Section Ⅳ; here a thorough discussion of the main directions building a diversified landscape of Granular Modeling and Data Analytics is presented. Passive and active aggregation mechanisms required in the realization of distributed data analysis are included in Section Ⅴ.
Ⅱ. INFORMATION GRANULES AND INFORMATION GRANULARITY

The framework of Granular Computing, along with a diversity of its formal settings, offers a critically needed conceptual and algorithmic environment. A suitable perspective built with the aid of information granules is advantageous in realizing a suitable level of abstraction. It also becomes instrumental when forming sound and pragmatic problem-oriented trade-offs among precision of results, their ease of interpretation, value, and stability (where all of these aspects contribute vividly to the general notion of actionability).
Information granules are intuitively appealing constructs, which play a pivotal role in human cognitive and decision-making activities (Bargiela and Pedrycz [1], [2]; Zadeh [3], [4]). We perceive complex phenomena by organizing existing knowledge along with available experimental evidence and structuring them in a form of some meaningful, semantically sound entities, which are central to all ensuing processes of describing the world, reasoning about the environment, and supporting decision-making activities.
The terms information granules and information granularity have emerged in different contexts and numerous areas of application. The term information granule carries various meanings. One can refer to Artificial Intelligence (AI), in which case information granularity is central to a way of problem solving through problem decomposition, where various subtasks can be formed and solved individually. Information granules and the area of intelligent computing revolving around them, termed Granular Computing, are quite often presented in direct association with the pioneering studies by Zadeh [3], who coined an informal yet highly descriptive and compelling concept of information granules. Generally, by an information granule one regards a collection of elements drawn together by their closeness (resemblance, proximity, functionality, etc.) articulated in terms of some useful spatial, temporal, or functional relationships. Subsequently, Granular Computing is about representing, constructing, processing, and communicating information granules. The concept of information granules is omnipresent, and this becomes well documented through a series of applications, cf. (Leng et al. [5]; Loia et al. [6]; Pedrycz and Bargiela [7]; Pedrycz and Gacek [8]; Zhou et al. [9]).
Granular Computing exhibits a variety of conceptual developments; one may refer here to selected pursuits:
graphs (Wang and Gong [10]; Chiaselotti et al. [11]; Pal et al. [12])
information tables (Chiaselotti et al. [13])
mappings (Salehi et al. [14])
knowledge representation (Chiaselotti et al. [15])
micro and macro models (Bisi et al. [16])
association discovery and data mining (Honko [17]; Wang et al. [18])
clustering (Tang et al. [19]) and rule clustering (Wang et al. [20])
classification (Liu et al. [21]; Savchenko [22])
There are numerous applications of Granular Computing, which are reported in recent publications:
Forecasting time series (Singh and Dhiman [23]; Hryniewicz and Karczmarek [24])
Prediction tasks (Han et al. [25])
Manufacturing (Leng et al. [5])
Concept learning (Li et al. [26])
Perception (Hu et al. [27])
Optimization (Martinez-Frutos et al. [28])
Credit scoring (Saberi et al. [29])
Analysis of microarray data (Ray et al. [30]; Tang et al. [31])
It is again worth emphasizing that information granules permeate almost all human endeavors. No matter which problem is taken into consideration, we usually set it up in a certain conceptual framework composed of some generic and conceptually meaningful entities — information granules, which we regard to be of relevance to the problem formulation, further problem solving, and a way in which the findings are communicated to the community. Information granules realize a framework in which we formulate generic concepts by adopting a certain level of generality.
Information granules naturally emerge when dealing with data, including those coming in the form of data streams. The ultimate objective is to describe the underlying phenomenon in an easily understood way and at a certain level of abstraction. This requires that we use a vocabulary of commonly encountered terms (concepts) and discover relationships between them and reveal possible linkages among the underlying concepts.
Information granules are examples of abstractions. As such, they naturally give rise to hierarchical structures: the same problem or system can be perceived at different levels of specificity (detail) depending on the complexity of the problem, available computing resources, and particular needs to be addressed. A hierarchy of information granules is inherently visible in their processing. The level of captured detail (which is represented in terms of the size of information granules) becomes an essential facet facilitating hierarchical processing of information, with different levels of the hierarchy indexed by the size of information granules.
Even the commonly encountered and simple examples presented above are convincing enough to lead us to ascertain that (a) information granules are the key components of knowledge representation and processing, (b) the level of granularity of information granules (their size, to be more descriptive) becomes crucial to the problem description and an overall strategy of problem solving, (c) a hierarchy of information granules supports an important aspect of perception of phenomena and delivers a tangible way of dealing with complexity by focusing on the most essential facets of the problem, and (d) there is no universal level of granularity of information; commonly the size of granules is problem-oriented and user-dependent.
Human-centricity comes as an inherent feature of intelligent systems. It is anticipated that a two-way effective human-machine communication is imperative. Humans perceive the world, reason, and communicate at some level of abstraction. Abstraction comes hand in hand with non-numeric constructs, which embrace collections of entities characterized by some notions of closeness, proximity, resemblance, or similarity. These collections are referred to as information granules. Processing of information granules is a fundamental way in which people process such entities. Granular Computing has emerged as a framework in which information granules are represented and manipulated by intelligent systems. The two-way communication of such intelligent systems with their users becomes substantially facilitated because of the use of information granules.
It brings together the existing plethora of formalisms of set theory (interval analysis) under the same banner by clearly visualizing that in spite of their visibly distinct underpinnings (and ensuing processing), they exhibit some fundamental commonalities. In this sense, Granular Computing establishes a stimulating environment of synergy between the individual approaches. By building upon the commonalities of the existing formal approaches, Granular Computing helps assemble heterogeneous and multifaceted models of processing of information granules by clearly recognizing the orthogonal nature of some of the existing and well established frameworks (say, probability theory coming with its probability density functions and fuzzy sets with their membership functions). Granular Computing fully acknowledges a notion of variable granularity, whose range could cover detailed numeric entities and very abstract and general information granules. It looks at the aspects of compatibility of such information granules and ensuing communication mechanisms of the granular worlds. Granular Computing gives rise to processing that is less time demanding than the one required when dealing with detailed numeric processing.
A. Frameworks of Information Granules

There are numerous formal frameworks of information granules; for illustrative purposes, we recall some selected alternatives.
Sets (intervals) realize a concept of abstraction by introducing a notion of dichotomy: we admit an element to belong to a given information granule or to be excluded from it. Along with set theory comes a well-developed discipline of interval analysis (Alefeld and Herzberger [32]; Moore [33]; Moore et al. [34]).
Fuzzy sets deliver an important conceptual and algorithmic generalization of sets (Dubois and Prade [35]-[37]; Klir and Yuan [38]; Nguyen and Walker [39]; Pedrycz et al. [40]; Pedrycz and Gomide [41]; Zadeh [42]-[44]). By admitting partial membership of an element in a given information granule, we bring in an important feature which puts the concept in rapport with reality. It helps when working with notions where the principle of dichotomy is neither justified nor advantageous.
Fuzzy sets come with a spectrum of operations, usually realized in terms of triangular norms (Klement et al. [45]; Schweizer and Sklar [46]).
Shadowed sets (Pedrycz [47], [48]) offer an interesting description of information granules by distinguishing among three categories of elements. Those are the elements, which (ⅰ) fully belong to the concept, (ⅱ) are excluded from it, (ⅲ) their belongingness is completely unknown.
Rough sets (Pawlak [49]-[52]; Pawlak and Skowron [53]) are concerned with a roughness phenomenon, which arises when an object (pattern) is described in terms of a limited vocabulary of a certain granularity. A description of this nature gives rise to so-called lower and upper bounds forming the essence of a rough set.
The list of formal frameworks is quite extensive; as interesting examples, one can recall here probabilistic sets (Hirota [54]) and axiomatic fuzzy sets (Liu and Pedrycz [55]).
There are two important directions of generalization of information granules, namely information granules of higher type and information granules of higher order. The essence of information granules of higher type is that the characterization (description) of information granules is expressed in terms of information granules rather than numeric entities. Well-known examples are type-2 fuzzy sets, granular intervals, and imprecise probabilities. For instance, a type-2 fuzzy set is a fuzzy set whose grades of membership are not single numeric values (membership grades in the unit interval) but information granules themselves, namely fuzzy sets defined over the unit interval.
Along with a truly remarkable diversity of detailed algorithms and optimization mechanisms of clustering, the paradigm itself leads to the formation of information granules (associated with the ideas and terminology of fuzzy clustering, rough clustering, and others) and applies both to numeric data and information granules. Information granules built through clustering are predominantly data-driven, viz. clusters (either in the form of fuzzy sets, sets, or rough sets) are a manifestation of a structure encountered (discovered) in the data.
Numeric prototypes are formed through invoking clustering algorithms, which yield a partition matrix and a collection of the prototypes. Clustering realizes a certain process of abstraction producing a small number of prototypes based on a large number of numeric data. Interestingly, clustering can also be completed in the feature space. In this situation, the algorithm returns a small collection of abstracted features (groups of features) that might be referred to as meta-features.
Two ways of generalization of prototypes, treated as key descriptors of data and manageable chunks of knowledge, are considered: (ⅰ) symbolic and (ⅱ) granular. In the symbolic generalization, one moves away from the numeric values of the prototypes and regards them as sequences of integer indexes (labels). Along this line, concepts of (symbolic) stability and (symbolic) resemblance of data structures are developed. The second generalization motivates the build-up of granular prototypes, which arise as a direct quest for a more comprehensive representation of the data than the one delivered through numeric entities. This entails that information granules (including their associated level of abstraction) have to be prudently formed to achieve the required quality of the granular model.
As a consequence, the performance evaluation embraces the following sound alternatives: (ⅰ) evaluation of representation capabilities of numeric prototypes, (ⅱ) evaluation of representation capabilities of granular prototypes, and (ⅲ) evaluation of the quality of the granular model.
In the first situation, the representation capabilities of numeric prototypes are assessed with the aid of a so-called granulation-degranulation scheme yielding a certain reconstruction error. The essence of the scheme can be schematically portrayed as follows.
1) encoding (granulation) leading to the degrees of activation of information granules by input ${\pmb x}$
$ \begin{equation} A_i({\pmb x})=\dfrac{1} {\displaystyle\sum\limits^{c}_{j=1}\left(\dfrac{\parallel{\pmb x}-{\pmb v}_i\parallel} {\parallel{\pmb x}-{\pmb v}_j\parallel}\right)^{\frac{2}{m-1}}} \end{equation} $  (1)
in case the prototypes are developed with the use of the Fuzzy C-Means (FCM) clustering algorithm; the parameter $m$ ($m>1$) stands for the fuzzification coefficient used by the method.
2) degranulation producing a reconstruction of the input ${\pmb x}$
$ \begin{equation} \hat {{\pmb x}}=\dfrac{\displaystyle\sum\limits^{c}_{i=1} A^{m}_{i}({\pmb x}){\pmb v}_i} {\displaystyle\sum\limits^{c}_{i=1} A^{m}_{i}({\pmb x})}. \end{equation} $  (2)
It is worth stressing that the above-stated formulas are a consequence of the underlying optimization problems. For any collection of numeric data, the reconstruction error is the sum of squared errors (distances) between the original data and their reconstructed versions.
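As a sketch, the granulation-degranulation scheme of (1) and (2) can be coded directly; the prototype matrix `V` and the fuzzification coefficient `m` are assumed to originate from a previously run FCM algorithm.

```python
import numpy as np

def granulate(x, V, m=2.0):
    """Encode x as degrees of activation A_i(x) of the granules anchored
    at the prototypes V (one prototype per row), following (1)."""
    d = np.linalg.norm(V - x, axis=1)
    d = np.maximum(d, 1e-12)  # guard against x coinciding with a prototype
    ratios = (d[:, None] / d[None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratios.sum(axis=1)

def degranulate(A, V, m=2.0):
    """Reconstruct x from its activation levels, following (2)."""
    w = A ** m
    return (w[:, None] * V).sum(axis=0) / w.sum()

def reconstruction_error(X, V, m=2.0):
    """Sum of squared distances between data and their reconstructions."""
    return sum(float(np.sum((x - degranulate(granulate(x, V, m), V, m)) ** 2))
               for x in X)
```

The activations returned by `granulate` sum to one, and the reconstruction error serves as the assessment of the representation capabilities of the numeric prototypes.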
B. The Principle of Justifiable Granularity

The principle of justifiable granularity guides the construction of an information granule based on available experimental evidence. In a nutshell, the resulting information granule becomes a summarization of the data (viz. the available experimental evidence). The underlying rationale behind the principle is to deliver a concise and abstract characterization of the data such that (ⅰ) the produced granule is justified in light of the available experimental data, and (ⅱ) the granule comes with well-defined semantics, meaning that it can be easily interpreted and becomes distinguishable from the others.
Formally speaking, these two intuitively appealing criteria are expressed by the criterion of coverage and the criterion of specificity. Coverage states how much data is positioned behind the constructed information granule. Put differently, coverage quantifies the extent to which an information granule is supported by the available experimental evidence. Specificity, on the other hand, quantifies how detailed the granule is, thereby stressing its semantics (meaning).
The definitions of coverage and specificity require formalization, and this depends on the formal nature of the information granule to be formed. As an illustration, consider an interval form of information granule $A=[a, b]$ built over one-dimensional experimental data $x_1, x_2, \ldots, x_N$; the coverage and specificity are expressed as
$ \begin{equation} {\rm cov}(A)={\frac{1}{N}}{\rm card}\{{x_k} \mid {x_k}\in{A} \} \end{equation} $  (3)
$ \begin{equation} sp(A)=g({\rm length}(A))=1-\frac{b-a}{\rm range} \end{equation} $  (4)
where range stands for an entire space over which intervals are defined.
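In code, the interval versions of (3) and (4) are a few lines each; the interval bounds `a`, `b` and the range of the underlying space are the only inputs.

```python
import numpy as np

def coverage(data, a, b):
    """cov(A): fraction of the data falling inside the interval A = [a, b], as in (3)."""
    data = np.asarray(data)
    return float(np.mean((data >= a) & (data <= b)))

def specificity(a, b, rng):
    """sp(A) = 1 - (b - a)/range: the narrower the interval, the more specific, as in (4)."""
    return 1.0 - (b - a) / rng
```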
If we consider a fuzzy set as a formal setting for information granules, the definitions of coverage and specificity are reformulated to take into account the nature of membership functions admitting a notion of partial membership. Here we invoke the fundamental representation theorem stating that any fuzzy set can be represented as a family of its $\alpha$-cuts, namely
$ A(x)=\textrm{sup}_{\alpha\in[0, 1]}\left[\min(\alpha, A_{\alpha}(x))\right] $ 
where
$ \begin{equation} A_{\alpha}=\left\{x \mid A(x)\geq{\alpha}\right\}. \end{equation} $  (5)
The supremum (sup) is taken over all values of $\alpha$ in the unit interval.
Having this in mind and considering (3) as a point of departure for constructs of sets (intervals), we have the following relationships
1) coverage
$ \begin{equation} {\rm cov}(A)=\frac{1}{N}\int_{\pmb{X}}{A(x)dx} \end{equation} $  (6) 
where ${\pmb X}$ stands for the space over which the fuzzy set $A$ is defined
2) specificity
$ \begin{equation} sp(A)=\int_{0}^{1}sp(A_\alpha)d\alpha \end{equation} $  (7)
Note that (6) and (7) directly generalize (3). The one-dimensional case can be extended to the multi-dimensional situation; here the count of data falling within the bounds of the information granule involves some distance function, namely
$ \begin{equation} {\rm cov}(A)=\frac{1}{N}{\rm card} \left\{{\pmb x}_k \mid \|{\pmb x}_k-{\pmb v}\|\leq \rho\right\} \end{equation} $  (8)
where ${\pmb v}$ denotes a numeric representative of the data and $\rho$ stands for the size (radius) of the information granule.
The key objective is to build an information granule so that it achieves the highest values of coverage and specificity. These criteria are in conflict: increasing the coverage reduces the specificity and vice versa. A viable alternative is to take the product of the two components and construct the information granule so that this product attains its maximal value. In other words, the optimal values of the parameters of the information granule, collected in a vector ${\pmb w}$, are determined by maximizing
$ \begin{equation} {{{\pmb w}}}_{opt}={\rm arg}\textit{Max}_{{{{\pmb w}}}}\left[{\rm cov}({{{\pmb w}}})\,sp({{{\pmb w}}})\right]. \end{equation} $  (9)
Note that both the coverage and specificity are functions of ${\pmb w}$.
From the perspective of algorithmic developments, the construction of information granules embraces two phases. First, a numeric representative of the experimental evidence is formed (usually considering some well-known alternatives such as a mean, weighted mean, median, etc.). Second, an information granule is spanned over this initial numeric representative, and the development of an interval, fuzzy set, etc. is guided by the maximization of the product of the coverage and specificity.
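The two-phase construction can be sketched as follows; the numeric representative is taken to be the median, and each interval bound is located by a simple grid search (a heuristic stand-in for whatever optimizer one prefers).

```python
import numpy as np

def justifiable_interval(data, n_grid=201):
    """Build an interval information granule [a, b] spanned over the median,
    choosing each bound to maximize the product cov * sp as in (9)."""
    data = np.sort(np.asarray(data, dtype=float))
    med = float(np.median(data))
    rng = data[-1] - data[0]

    def score(a, b):
        cov = np.mean((data >= a) & (data <= b))
        sp = max(0.0, 1.0 - (b - a) / rng)
        return cov * sp

    # phase 2: expand each bound away from the numeric representative
    b_opt = max(np.linspace(med, data[-1], n_grid), key=lambda b: score(med, b))
    a_opt = max(np.linspace(data[0], med, n_grid), key=lambda a: score(a, b_opt))
    return a_opt, b_opt
```

Because coverage and specificity are in conflict, outlying data are typically left outside the optimized interval: the gain in coverage they offer does not compensate for the loss of specificity.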
As a way of constructing information granules, the principle of justifiable granularity exhibits a significant level of generality in two essential ways. First, given the underlying requirements of coverage and specificity, different formalisms of information granules can be engaged. Second, experimental evidence could be expressed as information granules in different formalisms and on this basis certain information granule is being formed.
It is worth stressing that there is a striking difference between clustering and the principle of justifiable granularity. First, clustering leads to the formation of at least two information granules (clusters), whereas the principle of justifiable granularity produces a single information granule. Second, when positioning clustering and the principle vis-à-vis each other, the principle of justifiable granularity can be thought of as a follow-up step facilitating an augmentation of the numeric representative of the cluster (such as, e.g., a prototype) and yielding granular prototypes in which the facet of information granularity is explicitly captured.
So far, the principle of justifiable granularity has been presented in a generic scenario, meaning that experimental evidence gives rise to a single information granule. Several conceptual augmentations are considered where some sources of auxiliary information are supplied:
Involvement of an auxiliary variable. Typically, this could be a dependent variable one encounters in regression and classification problems. An information granule is built on the basis of experimental evidence gathered for some input variable, and now the associated dependent variable is engaged. In the formulation of the principle of justifiable granularity, this additional information impacts the way in which the coverage is determined. In more detail, we discount the coverage; in its calculation, one has to take into account the nature of the experimental evidence assessed on the basis of some external source of knowledge. In regression problems (continuous output/dependent variable), the calculation of the discounted coverage involves the variability of the dependent variable $y$ over the data falling within the granule
$ \begin{equation} {\rm cov}'(A)={\rm cov}(A)\,{\rm exp}(-\beta\sigma^{2}_{y}) \end{equation} $  (10)
where $\beta$ is a certain positive parameter and $\sigma^{2}_{y}$ denotes the variance of the dependent variable computed over the data covered by the granule.
In the case of a classification problem in which the data belong to a number of classes, the coverage is discounted by the entropy $h(\omega)$ of the class distribution $\omega$ of the data falling within the information granule
$ \begin{equation} {\rm cov}'(A)={\rm cov}(A)(1-h(\omega)). \end{equation} $  (11)
This expression penalizes the diversity of the data contributing to the information granule and not being homogeneous in terms of class membership. The higher the entropy, the lower the discounted coverage ${\rm cov}'(A)$.
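A minimal sketch of the class-discounted coverage of (11), under the assumption that $h$ is the entropy of the class distribution normalized to $[0, 1]$:

```python
import numpy as np

def normalized_entropy(labels):
    """Entropy of the class distribution inside the granule, scaled to [0, 1]."""
    _, counts = np.unique(np.asarray(labels), return_counts=True)
    if len(counts) < 2:
        return 0.0  # a single class: perfectly homogeneous
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)) / np.log2(len(counts)))

def discounted_coverage(cov, labels):
    """cov'(A) = cov(A) * (1 - h): class-heterogeneous granules are penalized, as in (11)."""
    return cov * (1.0 - normalized_entropy(labels))
```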
Information granules are described through numeric parameters (or eventually granular parameters in the case of information granules of higher type). There is an alternative view of a collection of information granules where we tend to move away from numeric details and instead look at the granules as symbols and engage them in further symbolic processing. Interestingly, symbolic processing is vividly manifested in Artificial Intelligence (AI).
Consider a collection of information granules
Once this phase has been completed,
$ \begin{equation} d(i, j, k)=\dfrac{|i_k-j_k|}{c_k} \end{equation} $  (12)
where
$ \begin{equation} \textit{Sim}(A_i, A_j)=\frac{1}{P}\sum\limits_{k\in{R_{ij}}}\left(1-\frac{|i_k-j_k|}{c_k}\right) \end{equation} $  (13)
where $R_{ij}$ stands for the set of indexes over which the two sequences are compared and $P$ denotes its cardinality.
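Assuming that two prototypes have already been recoded as integer index sequences over the same positions (i.e., $R_{ij}$ covers all $P$ positions), (13) reduces to a short routine; the function name is hypothetical.

```python
import numpy as np

def symbolic_similarity(seq_i, seq_j, c):
    """Sim(A_i, A_j) as in (13): average positionwise agreement of two integer
    index sequences, each position k normalized by the number of labels c_k."""
    seq_i, seq_j, c = (np.asarray(v, dtype=float) for v in (seq_i, seq_j, c))
    return float(np.mean(1.0 - np.abs(seq_i - seq_j) / c))
```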
When coping with spatiotemporal data (say, time series of temperature recorded in a given geographic region), a concept of spatiotemporal probes arises as an efficient vehicle to describe the data, capture their local nature, and articulate their local characteristics, as well as elaborate on their abilities as modeling artifacts (building blocks). The experimental evidence is expressed as a collection of data $z(x_k, y_k, t_k)$.
Here an information granule is positioned in a threedimensional space:
(ⅰ) space of the spatiotemporal variable $z$
(ⅱ) spatial position defined by the coordinates $x$ and $y$
(ⅲ) temporal domain described by the time coordinate $t$.
The information granule to be constructed is spanned over some position $z_0$ in the space of values of $z$, along with a spatial anchor $(x_0, y_0)$ and a temporal anchor $t_0$; the associated coverage and specificity measures read as
$ \begin{equation} {\rm cov}(A)={\rm card}\{z(x_k, y_k, t_k) \mid |z(x_k, y_k, t_k)-z_0|\leq\rho\} \end{equation} $  (14)
$ \begin{align} sp(A)&=\max(0, 1-\dfrac{|z-z_0|}{L_z})\\ sp_{x}&=\max(0, 1-\dfrac{|x-x_0|}{L_x})\\ sp_{y}&=\max(0, 1-\dfrac{|y-y_0|}{L_y})\\ sp_{t}&=\max(0, 1-\dfrac{|t-t_0|}{L_t}) \end{align} $  (15)
where the above specificity measures are monotonically decreasing functions (linear functions in the case shown above). There are some cut-off ranges ($L_z$, $L_x$, $L_y$, $L_t$) beyond which the corresponding specificity values drop to zero.
The results of clustering, coming in the form of numeric prototypes ${\pmb v}_1, {\pmb v}_2, \ldots, {\pmb v}_c$, can be augmented to granular prototypes by optimizing the radius $\rho_i$ associated with the $i$th prototype
$ \begin{equation} \rho_{i, opt}={\rm arg}{\textit{Max}}_{\rho_i}[{\rm cov}(V_i)sp(V_i)] \end{equation} $  (16) 
where
$ \begin{align} {\rm cov}(V_i)&=\frac{1}{N}{\rm card}\left\{{\pmb x}_k \mid \|{\pmb x}_k-{\pmb v}_i\|\leq\rho_i\right\}\\ sp(V_i)&=1-\rho_i \end{align} $  (17)
assuming that we are concerned with normalized data.
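The optimization of (16)-(17) admits a direct grid-search sketch; normalized data are assumed, so the specificity is simply $1-\rho_i$.

```python
import numpy as np

def optimal_radius(X, v, n_grid=201):
    """Radius rho_i maximizing cov(V_i) * sp(V_i) around prototype v, per (16)-(17)."""
    d = np.linalg.norm(np.asarray(X) - np.asarray(v), axis=1)
    best_rho, best_score = 0.0, -1.0
    for rho in np.linspace(0.0, 1.0, n_grid):
        score = float(np.mean(d <= rho)) * (1.0 - rho)
        if score > best_score:
            best_rho, best_score = rho, score
    return best_rho
```

Running this per prototype yields the collection of granular prototypes whose representation abilities are assessed next.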
It is worth noting that having a collection of granular prototypes, one can conveniently assess their abilities to represent the original data (experimental evidence). The reconstruction problem, as outlined before for numeric data, can be formulated as follows: given ${\pmb x}$, determine its granular reconstruction described by the center $\hat{{\pmb v}}$ and the radius $\hat{\rho}$
$ \begin{align} \hat{{{{\pmb v}}}}&=\dfrac{\displaystyle\sum\limits^c_{i=1}A^m_i({\pmb x}){\pmb v}_i} {\displaystyle\sum\limits^c_{i=1}A^m_i({\pmb x})}\\ \hat{\rho}&=\dfrac{\displaystyle\sum\limits^c_{i=1}A^m_i({\pmb x})\rho_i} {\displaystyle\sum\limits^c_{i=1}A^m_i({\pmb x})} \end{align} $  (18)
The quality of reconstruction uses the coverage criterion formed with the aid of the Boolean predicate $T$
$ \begin{align} T({{\pmb x}}_k)= \begin{cases} 1, &\textrm{if}~\|\hat{{{\pmb v}}}-{{\pmb x}}_k\|\leq\hat{\rho}\\ 0, &{\rm otherwise} \end{cases} \end{align} $  (19)
and for all data one takes the sum of (19) over them. It is worth noting that in addition to the global measure of quality of granular prototypes, one can associate with them their individual quality (taken as a product of the coverage and specificity computed in the formation of the corresponding information granule).
Ⅳ. THE PARADIGM SHIFT IN DATA ANALYTICS: FROM NUMERIC DATA TO INFORMATION GRANULES AND MODELING CONSTRUCTS

The paradigm shift implied by the engagement of information granules becomes manifested in several tangible ways, including (ⅰ) a stronger dependence on data when building structure-free, user-oriented, and versatile models spanned over selected representatives of the experimental data, (ⅱ) the emergence of models at varying levels of abstraction (generality) delivered by the specificity/generality of information granules, and (ⅲ) the building of a collection of individual local models and support for their efficient aggregation.
A functional scheme emerging as a consequence of the above discussion and advocated by the agenda of Granular Computing is succinctly outlined in Fig. 1.
Fig. 1 A landscape of Granular Modeling and Data Analytics: main design pursuits. 
Here several main conceptually and algorithmically farreaching paths are emphasized. Notably, some of them have been studied to some extent in the past and several open up new directions worth investigating and pursuing. In what follows, we elaborate on them in more detail pointing at the relationships among them.
In distributed data analysis and system modeling, the following formulation of the problem can be outlined. There are some sources of data composed of different data subsets and feature (attribute) subsets. An example is a collection of models built for a complex phenomenon where a lot of data using different features are available locally and used to construct local models. Denote these models by $M_1, M_2, \ldots, M_p$.
Two modes of aggregation architecture are envisioned, namely a passive and an active aggregation mode.
Passive mode of aggregation. The essence of this mode is illustrated in Fig. 2.
Fig. 2 A general scheme of a passive mode of aggregation. 
Let us proceed with the computing details. The models were constructed using some data. As each model comes with its performance index, we introduce a weight of the corresponding model, say $q_i$ for model $M_i$; the better the performance, the higher the weight.
For the realization of the aggregation mechanism and its evaluation, we consider some data set; for each of its elements, the models return the outputs
$ \begin{align} y_{1}(1), \ldots y_{p}(1), y_{1}(2), \ldots y_{p}(2), \ldots , y_{1}(N), \ldots y_{p}(N). \end{align} $  (20) 
In light of the existing weights, the above data are associated with them forming a set
$ \begin{align} &(y_{1}(1), q_{1}), \ldots, (y_{p}(1), q_{p}), (y_{1}(2), q_{1}), \ldots, (y_{p}(2), q_{p}), \ldots, \\ &(y_{1}(N), q_{1}), \ldots, (y_{p}(N), q_{p}). \end{align} $  (21)
For any fixed element of the data set, the weighted outputs are then aggregated to produce a single representative result.
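As one plausible realization of this passive scheme (the concrete aggregation operator is not fixed above), the weighted outputs at a fixed data point can be summarized by an interval maximizing weighted coverage times specificity, in the spirit of the principle of justifiable granularity; the interval is spanned over the weighted mean.

```python
import numpy as np

def aggregate(y, q, n_grid=201):
    """Aggregate model outputs y_1..y_p with quality weights q_1..q_p into an
    interval maximizing weighted coverage times specificity (a sketch; the
    aggregation operator itself is an assumption, not prescribed by the text)."""
    y = np.asarray(y, dtype=float)
    q = np.asarray(q, dtype=float)
    center = float(np.average(y, weights=q))
    rng = float(y.max() - y.min()) or 1.0

    def score(a, b):
        cov = q[(y >= a) & (y <= b)].sum() / q.sum()
        sp = max(0.0, 1.0 - (b - a) / rng)
        return cov * sp

    b_opt = max(np.linspace(center, y.max(), n_grid), key=lambda b: score(center, b))
    a_opt = max(np.linspace(y.min(), center, n_grid), key=lambda a: score(a, b_opt))
    return a_opt, b_opt
```

Outputs of low-weight (poorly performing) models contribute little to the coverage, so the aggregate tends to discard them.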
Active mode of aggregation. In contrast to the passive aggregation mechanism used to combine models, active aggregation means that the results produced by the individual models are actively modified (Fig. 3). The essence of this active modification is to endow the aggregation process with some highly desired flexibility so that the consensus can be enhanced. While bringing some flexibility to the results to be aggregated increases the overall consensus, at the same time the quality of the local models is reduced (as we strive to improve the quality of the aggregated results, we are inclined to sacrifice the quality of the models). A sound balance needs to be set up so that the consensus level is increased without significantly affecting the quality of the individual models.
Fig. 3 Active mode of aggregation. 
A viable approach is to invoke here a randomization mechanism, refer to Fig. 3; see also Li and Wang [63]. It makes the numeric output of each model random, as we additively include some noise of a certain intensity (standard deviation). The higher the variance, the higher the random flexibility of the model and the higher the ability to achieve a higher value of consensus of the results produced by the models. In other words, the original outputs of the models $y_i(k)$ are replaced by their randomized counterparts $y'_i(k)$, and the weights are updated accordingly
$ \begin{align} q'_{i}(k)=q_{i}\,{\rm exp}\left(-\dfrac{|y_{i}(k)-y'_{i}(k)|}{y_{i, \max}-y_{i, \min}}\right) \end{align} $  (22)
where $y_{i, \max}$ and $y_{i, \min}$ stand for the bounds of the output of the $i$th model.
$ \begin{align} &(y_{1}'(1), q_{1}'(1)), \ldots, (y_{p}'(1), q_{p}'(1)), (y_{1}'(2), q_{1}'(2)), \ldots, (y_{p}'(2), q_{p}'(2)), \ldots, \\ &(y_{1}'(N), q_{1}'(N)), \ldots, (y_{p}'(N), q_{p}'(N)). \end{align} $  (23)
The evaluation of the consensus built for the randomized outputs of the models is more elaborate because of the randomness factor that needs to be taken into consideration. Consider the outputs of the models for some fixed data point indexed by $k$; an information granule $Y(k)$ is formed by maximizing the product of coverage and specificity
$ \begin{align} Y(k)={\rm arg}\textit{Max}_{F_k}[{\rm cov}(k)sp(k)]. \end{align} $  (24) 
where the maximization is carried out over a family $F_k$ of candidate information granules. The overall consensus is quantified by averaging over all data
$ \begin{align} T=\frac{1}{N}\sum\limits^{N}_{k=1}{\rm max}_{F_k}[{\rm cov}(k)sp(k)]. \end{align} $  (25) 
As anticipated, the higher the randomness involved ($s$), the higher the above average. This implies that better aggregation outcomes are formed because of the higher consistency of the outputs produced by the individual models. As noted earlier, this comes with a deterioration of the quality of the models, and this effect can be observed by monitoring the values of the weights
$ \begin{align} q=\frac{1}{p}\displaystyle\sum\limits^{p}_{i=1}q_{i}, ~~ q'=\frac{1}{Np}\displaystyle\sum\limits^{N}_{k=1} \displaystyle\sum\limits^{p}_{i=1}q_i'(k) \end{align} $  (26) 
and analyze the ratio $q'/q$.
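The randomization and weight update of (22) can be sketched as follows; the outputs are arranged in an $N \times p$ array (data points by models), and the noise intensity `s` plays the role discussed above.

```python
import numpy as np

def randomize(Y, s, seed=0):
    """Active mode: additively perturb the model outputs with Gaussian noise
    of standard deviation s."""
    rng = np.random.default_rng(seed)
    return Y + rng.normal(0.0, s, size=Y.shape)

def updated_weights(q, Y, Y_prime):
    """q'_i(k) = q_i * exp(-|y_i(k) - y'_i(k)| / (y_i_max - y_i_min)), as in (22);
    Y and Y_prime are N x p arrays, q is the vector of p original weights."""
    span = Y.max(axis=0) - Y.min(axis=0)
    span = np.where(span == 0.0, 1.0, span)  # guard against constant outputs
    return q * np.exp(-np.abs(Y - Y_prime) / span)
```

Since the exponential factor never exceeds one, the updated weights are bounded from above by the original ones, which makes the deterioration of model quality directly observable through the ratio $q'/q$.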
The above randomization scheme is one among possible alternatives and some other more refined options can be analyzed:
(ⅰ) a sound refinement could be to bring in the component of randomness of varying intensity ($s$) adjusted individually across the models;
(ⅱ) The modification to the randomized approach could invoke the involvement of the prediction intervals that are quite often associated with the construction of the model. The random modifications of the outputs are confined to the values falling within the scope of the prediction intervals. Obviously, to follow this development path, the prediction intervals have to be provided. This is not an issue in the case of some categories of models such as linear regression models (where prediction intervals are developed on a routine basis); however, this could be more complicated and computationally demanding in the case of models such as neural networks and rule-based models.
The randomization of the output space (the outputs of the models) was presented above; another option is to randomize the parameters of the models, or to combine the randomization of the output and the parameter spaces.
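The parameter-space alternative can be sketched for a linear model; the model form, the fitted parameters, and all numbers below are illustrative assumptions:

```python
import random

rng = random.Random(11)
w, b = 1.5, -0.3                       # hypothetical fitted parameters
x_data = [0.5 * i for i in range(10)]

# Parameter-space randomization: each replicate perturbs (w, b), and the whole
# output vector is then produced by the linear model with those parameters.
replicates = []
for _ in range(20):
    w_r = w + rng.gauss(0.0, 0.1)
    b_r = b + rng.gauss(0.0, 0.1)
    replicates.append([w_r * x + b_r for x in x_data])
```

In contrast with output randomization, the perturbed outputs of a single replicate are correlated across the data (they share one draw of the parameters), which changes the character of the ensemble being aggregated.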
Ⅵ. CONCLUDING COMMENTS
The study has offered a focused overview of the fundamentals of Granular Computing positioned in the context of data analytics and advanced system modeling. We identified the multifaceted role of information granules as meaningful conceptual entities formed at the required level of abstraction. It has been emphasized that information granules are not only reflective of the nature of the data (the principle of justifiable granularity highlights the reliance of granules on the available experimental evidence) but can also efficiently capture auxiliary domain knowledge conveyed by the user; in this way they reflect the human-centricity of the investigations and enhance the actionability of the results. The interpretation of information granules at the qualitative (linguistic) level, together with their emerging characteristics (e.g., stability), enhances the interpretability of the framework of processing information granules; this is another important aspect of data analytics that directly aligns with the requirements expressed by the user. Several key avenues of system modeling based on the principles of Granular Computing were highlighted; while some of them have been the subject of intensive study, others require further investigation.
By no means is the study complete; rather, it can be regarded as a solid departure point identifying the main directions of further far-reaching human-centric data analysis investigations. A number of promising avenues are open that are well aligned with the current challenges of data analytics, including the reconciliation of results realized in the presence of various sources of knowledge (models, results of analysis), hierarchies of findings, and the quantification of trade-offs between accuracy and interpretability (transparency).