Operations

This section continues the op_pandas library guide and covers some of the operations you can perform on PrivateDataFrame and PrivateSeries objects. The following operations are covered:

Unary Ops

Users can perform unary operations such as ~, -, +, and abs on PrivateDataFrames and PrivateSeries. These operations apply element-wise to the data.

  • The ~ operator performs the bitwise negation operation.
  • The - operator performs the arithmetic negation operation.
  • The + operator performs the unary plus operation, which leaves each value unchanged.
  • The abs() function calculates the absolute value of each element.

See the following example:

%%ag

# Export the quick statistics of the original PrivateDataFrame 'priv_df_2' and its negative counterpart
export(priv_df_2.describe(eps=2), 'original')
export((-priv_df_2).describe(eps=2), 'negative')

When executed:

>>> 
Setting up exported variable in local environment: original
Setting up exported variable in local environment: negative

Then, in the local environment:

# Rename columns of the negative DataFrame for clarity
negative.columns = ["a_neg", "b_neg"]

# Join the original and negative DataFrames and print the result
print(original.join(negative, how="left"))

Output:

>>>
                  a             b         a_neg         b_neg
count  10000.000000  10000.000000  10000.000000  10000.000000
mean       1.498597      1.504388     -1.500690     -1.503697
std        0.494197      0.498858      0.498402      0.499788
min        1.000000      1.000000     -1.000000     -1.000000
25%        1.000000      1.000000     -1.001932     -1.001540
50%        1.635791      1.891447     -1.678784     -1.161680
75%        1.991538      1.997409     -1.996417     -1.996500
max        1.992424      1.997140     -1.999776     -1.999894

Where:

  • The quick statistics (count, mean, std, min, 25%, 50%, 75%, max) of the original PrivateDataFrame priv_df_2 and its negative counterpart are exported to the local environment.
  • The negative DataFrame is created by applying the unary - operator to the original PrivateDataFrame priv_df_2.
  • The columns of the negative DataFrame are renamed for clarity.
  • The original and negative DataFrames are joined together, and the result is printed, showing the element-wise application of the unary - operator.
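
To see what each unary operator does element by element without any differential-privacy noise, the following minimal sketch uses ordinary pandas Series. It assumes that PrivateDataFrame and PrivateSeries follow the same element-wise semantics as pandas; the variable names are illustrative and not part of op_pandas.

```python
import pandas as pd

# Plain pandas stand-ins for private objects (assumption: same element-wise semantics).
flags = pd.Series([True, False, True])
values = pd.Series([-2, 0, 3])

inverted = ~flags          # bitwise NOT: flips each boolean
negated = -values          # arithmetic negation: flips the sign of each value
unchanged = +values        # unary plus: leaves each value as it is
magnitudes = abs(values)   # absolute value of each element

print(inverted.tolist())    # [False, True, False]
print(negated.tolist())     # [2, 0, -3]
print(unchanged.tolist())   # [-2, 0, 3]
print(magnitudes.tolist())  # [2, 0, 3]
```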

Binary Ops

Users can apply binary operations between a PrivateDataFrame and either a scalar or another PrivateDataFrame. The metadata bounds of the result are updated to reflect the operation. See the example below:

%%ag

# Select the 'age' and 'salary' columns from the PrivateDataFrame 'priv_df' and obtain a PrivateDataFrame 'pdf'
pdf = priv_df[['age', 'salary']]

# Perform binary operations on 'pdf' with a mix of scalars and 'pdf' itself
result1 = pdf + (10 * pdf) # Expected min-max: Age: (0, 704), Salary: (11, 2200000)
result2 = result1 / 1000 # Expected min-max: Age: (0, 0.704), Salary: (0.011, 2200)

# Print the metadata of the resulting PrivateDataFrames 'result1' and 'result2'
ag_print("Result1 metadata: \n", result1.metadata)
ag_print("Result2 metadata: \n", result2.metadata)

When executed:

>>> 
Result1 metadata:
{'age': (0.0, 704.0), 'salary': (11, 2200000)}
Result2 metadata:
{'age': (0.0, 0.704), 'salary': (0.011, 2200.0)}

Where:

  • The 'age' and 'salary' columns are selected from the PrivateDataFrame priv_df to create a new PrivateDataFrame pdf.
  • Binary operations are performed on pdf using a mix of scalars and pdf.
  • result1 is obtained by adding pdf to 10 times pdf, and result2 is obtained by dividing result1 by 1000.
  • The metadata of the resulting PrivateDataFrames result1 and result2 is printed, showing the updated bounds after the binary operations.
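
The printed bounds are consistent with simple interval arithmetic on the column metadata. The sketch below reproduces them in plain Python. It is not the op_pandas implementation: the helper functions are hypothetical, and the starting bounds of (0, 64) for age and (1, 200000) for salary are inferred from the output above rather than stated in the guide.

```python
def add(a, b):
    """Add two intervals given as (lower, upper) tuples."""
    return (a[0] + b[0], a[1] + b[1])

def scale(bounds, k):
    """Multiply an interval by a scalar, swapping the ends when k is negative."""
    lo, hi = bounds
    return (lo * k, hi * k) if k >= 0 else (hi * k, lo * k)

def divide(bounds, k):
    """Divide an interval by a positive scalar."""
    lo, hi = bounds
    return (lo / k, hi / k)

# Assumed original metadata, inferred from the printed results.
meta = {"age": (0, 64), "salary": (1, 200000)}

# pdf + (10 * pdf)
result1 = {col: add(b, scale(b, 10)) for col, b in meta.items()}
# result1 / 1000
result2 = {col: divide(b, 1000) for col, b in result1.items()}

print(result1)  # {'age': (0, 704), 'salary': (11, 2200000)}
print(result2)  # {'age': (0.0, 0.704), 'salary': (0.011, 2200.0)}
```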

Bitwise Ops

Users can apply bitwise operations between a PrivateDataFrame and either a scalar or another PrivateDataFrame. As the example below shows, they can also be applied between two PrivateSeries. These operations apply element-wise to the data.

See the following example:

%%ag
import numpy as np
import pandas as pd

# Create two PrivateSeries with randomly sampled integer data, where each value is either 0 or 1
priv_ser_1 = PrivateSeries(pd.Series(np.random.randint(0, 2, 10000)), metadata=(0, 1))
priv_ser_2 = PrivateSeries(pd.Series(np.random.randint(0, 2, 10000)), metadata=(0, 1))

# Print the description of the first PrivateSeries
ag_print("Describe of private Series 1: \n", priv_ser_1.describe(eps=1))

# Print the description of the second PrivateSeries
ag_print("Describe of private Series 2: \n", priv_ser_2.describe(eps=1))

# Apply the bitwise AND operation between priv_ser_1 and priv_ser_2 and store the result in 'result'
result = priv_ser_1 & priv_ser_2

# Print the description of the resulting PrivateSeries
ag_print("Describe of the result: \n", result.describe(eps=1))

When executed:

>>> 
Describe of private Series 1:
count 9.998000e+03
mean 1.571300e-03
std 1.998231e-02
min 0.000000e+00
25% 4.656613e-10
50% 4.656613e-10
75% 4.656613e-10
max 4.656613e-10
Name: series, dtype: float64

Describe of private Series 2:
count 1.000500e+04
mean 5.608570e-04
std 4.612496e-02
min 0.000000e+00
25% 4.656613e-10
50% 4.656613e-10
75% 4.656613e-10
max 4.656613e-10
Name: series, dtype: float64

Describe of the result:
count 1.000700e+04
mean 3.952059e-04
std 2.277582e-02
min 0.000000e+00
25% 4.656613e-10
50% 4.656613e-10
75% 4.656613e-10
max 4.656613e-10
Name: series, dtype: float64

Where:

  • Two PrivateSeries priv_ser_1 and priv_ser_2 are created with randomly sampled integer data, where each value is either 0 or 1, and with metadata bounds (0, 1).
  • The descriptions of both PrivateSeries are printed, displaying the count, mean, std, min, 25%, 50%, 75%, and max values.
  • The bitwise AND operation (&) is applied between priv_ser_1 and priv_ser_2, and the result is stored in result.
  • The description of the resulting PrivateSeries result is printed, showing the statistics of the element-wise bitwise AND operation.
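
Because the noisy describe output hides the individual values, a small deterministic sketch with ordinary pandas Series may make the element-wise behaviour easier to follow. It assumes private objects apply the same semantics as pandas; the | and ^ operators are shown for plain pandas only, and the variable names are illustrative.

```python
import pandas as pd

# Plain pandas stand-ins holding 0/1 values, covering all four input combinations.
left = pd.Series([1, 1, 0, 0])
right = pd.Series([1, 0, 1, 0])

both = left & right     # AND: 1 only where both inputs are 1
either = left | right   # OR: 1 where at least one input is 1
differ = left ^ right   # XOR: 1 where the inputs differ
masked = left & 1       # AND with a scalar is applied to every element

print(both.tolist())    # [1, 0, 0, 0]
print(either.tolist())  # [1, 1, 1, 0]
print(differ.tolist())  # [0, 1, 1, 0]
print(masked.tolist())  # [1, 1, 0, 0]
```

For 0/1 data, the AND result can never exceed either input, so its bounds stay within (0, 1).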

See the Functions, Joins and Statistical Methods page to continue following the op_pandas guide.