Numpy V Pandas

NumPy and pandas are not independent of each other: pandas is built on top of NumPy, and the two are used together all the time.
In this post I look at them from a different perspective, covering the differences between them and which one suits a particular situation. Both are powerful building blocks of data analysis and scientific computing. Before getting to the main content, here is a brief introduction to each.

Numpy

NumPy (NUMerical PYthon) provides the ndarray (N-dimensional array), which holds homogeneous data. Its core is written in C, with some parts in Cython. An ndarray looks much like a conventional array or Python list, but operations on it are much faster.

>>> import numpy as np
>>> a = np.arange(10).reshape(2,5)
>>> a
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

We can get the data type of the array's elements through the dtype.name attribute:

>>> a.dtype.name
'int64'
>>> type(a)
<type 'numpy.ndarray'>
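
To make "homogeneous" and "faster than a list" a little more concrete, here is a minimal sketch. The variable names (`mixed`, `doubled`, `doubled_list`) are only illustrative and are not part of the original example.

```python
import numpy as np

# Mixing ints and floats does not give a mixed array:
# NumPy upcasts everything to one common dtype (homogeneous storage).
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Arithmetic is applied element-wise to the whole array,
# without writing a Python-level loop.
a = np.arange(10).reshape(2, 5)
doubled = a * 2

# The same operation on nested Python lists needs explicit loops.
doubled_list = [[x * 2 for x in row] for row in a.tolist()]
print(doubled.tolist() == doubled_list)  # True
```

The loop-free form is usually where the speed advantage comes from, since the work happens in compiled code rather than in the Python interpreter.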

For more info: Numpy Documentation

Pandas

pandas, whose name is derived from "panel data", provides the DataFrame, which holds heterogeneous data. It is built on top of NumPy and inherits many of its qualities. A DataFrame behaves much like a NumPy array, but it can do more than that. We can also convert a pandas DataFrame to a NumPy array, but this can be a costly operation, because columns of different types must be cast to a common type.

>>> import numpy as np
>>> import pandas as pd

One-dimensional pandas arrays are called Series, and can be created like this:

>>> s = pd.Series([1, 3, 5, np.nan, 6, 8])
>>> s
0     1
1     3
2     5
3   NaN
4     6
5     8
dtype: float64
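
As a small sketch of how a Series relates to NumPy: the values are backed by a NumPy array, and the NaN is the reason the dtype above is float64 rather than an integer type. The `labelled` Series and its index labels are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, np.nan, 6, 8])

# The underlying values can be retrieved as a NumPy array.
values = s.to_numpy()
print(type(values))

# NaN is a float, so the whole Series is stored as float64.
print(s.dtype)  # float64

# Unlike a plain ndarray, a Series carries an index of labels
# (hypothetical labels "a", "b", "c" here).
labelled = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(labelled["b"])  # 20
```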

Two-dimensional pandas arrays are called DataFrames. We are going to create a DataFrame, but first we create a series of dates to use as its index.

>>> dates = pd.date_range('20130101', periods=6)
>>> print(dates)
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01, ..., 2013-01-06]
Length: 6, Freq: D, Timezone: None

After creating the index, we create the DataFrame:

>>> df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
>>> df
                   A         B         C         D
2013-01-01 -0.228804  1.756711  0.029835  0.589072
2013-01-02 -0.214418  0.073005 -0.339403 -0.523901
2013-01-03  0.515138 -0.603327  0.785776 -0.661374
2013-01-04 -0.154879 -1.164844 -1.618861  0.904558
2013-01-05 -0.669651 -1.488846  1.431594  1.468455
2013-01-06  1.037434 -0.596740 -0.451529 -0.288568
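
The earlier remark about converting a DataFrame to a NumPy array can be sketched as follows. The `people` frame and its column names are hypothetical, chosen only to show what happens when columns have different types.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string, integer and float columns.
people = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "age": [31, 45],
    "height": [1.62, 1.80],
})
print(people.dtypes)

# Numeric columns only: the int column is cast to float
# so the result has a single common dtype.
numeric = people[["age", "height"]].to_numpy()
print(numeric.dtype)  # float64

# All columns: no common numeric type exists,
# so the result falls back to the generic object dtype.
everything = people.to_numpy()
print(everything.dtype)  # object
```

An object array gives up most of NumPy's speed and memory benefits, which is one reason the conversion is best done on numeric columns only.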


Numpy Vs Pandas
  • NumPy consumes less memory and can be faster than pandas.
  • pandas has a wide range of functions for reading tabular files such as CSV and TSV, and can also load data from relational databases such as MySQL.
  • As a rough guide, NumPy tends to perform better at around 50K rows or fewer, whereas pandas performs well with more than 500K rows.
  • We can integrate NumPy with C/C++ and Fortran code.
  • The two are not independent. They might look like two unrelated packages, but pandas is built on top of NumPy.
  • Slicing works differently in each.
  • In a DataFrame, columns can have names, which makes the data more readable.
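
The last two points, slicing and column names, can be illustrated with a short sketch. The row labels "w" to "z" and the column names are assumptions made up for this example.

```python
import numpy as np
import pandas as pd

arr = np.arange(12).reshape(4, 3)
df = pd.DataFrame(arr, index=list("wxyz"), columns=["A", "B", "C"])

# NumPy slicing is purely positional and excludes the end position.
arr_rows = arr[0:2]   # rows 0 and 1
arr_col = arr[:, 1]   # second column, addressed by position

# pandas .iloc is also positional and end-exclusive ...
df_iloc = df.iloc[0:2]    # rows "w" and "x"

# ... but .loc slices by label and includes the end label.
df_loc = df.loc["w":"y"]  # rows "w", "x" and "y"

# Columns are addressed by name rather than by position.
df_col = df["B"]

print(arr_rows.shape, df_iloc.shape, df_loc.shape)  # (2, 3) (2, 3) (3, 3)
```

The end-inclusive behaviour of `.loc` is the difference most likely to surprise someone coming from NumPy.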
