## Selecting and transforming columns

When selecting columns of a data frame, we prefix the column name with `:` to indicate that it is a variable name. For example, to select the column for variable `x`, we use `:x` inside `select()`.

First, let’s construct a 500 by 3 data frame called `df` with three columns named `A`, `B`, and `C`.

```
using DataFrames
df = DataFrame(
    A = 1:2:1000,                  # odd numbers 1, 3, ..., 999
    B = repeat(1:10, inner = 50),  # each of 1 to 10 repeated 50 times
    C = 1:500)
```
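As a quick sanity check on the dimensions, the sketch below rebuilds the same `df` so that it runs on its own (assuming DataFrames.jl is installed) and then inspects its size and column names.

```julia
using DataFrames

# Same construction as above, repeated so this snippet is self-contained.
df = DataFrame(
    A = 1:2:1000,
    B = repeat(1:10, inner = 50),
    C = 1:500,
)

dims = size(df)       # (number of rows, number of columns)
colnames = names(df)  # column names as strings
```

Here `dims` and `colnames` are illustrative variable names only; calling `size(df)` and `names(df)` directly at the REPL works just as well.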

Each of the lines below produces a 500 by 1 data frame, that is, a data frame that contains only column `A`.

```
select(df, :A)
df[:, [:A]]
```

For example, here are the first 10 rows of `select(df, :A)`.

```
select(df, :A) |> (x->first(x, 10))
```

Output:

```
10×1 DataFrame
 Row │ A
     │ Int64
─────┼───────
   1 │     1
   2 │     3
   3 │     5
   4 │     7
   5 │     9
   6 │    11
   7 │    13
   8 │    15
   9 │    17
  10 │    19
```
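Note that the indexing form above passes a vector of names, `[:A]`, rather than a single name. The sketch below (assuming DataFrames.jl is installed, with illustrative variable names) shows the difference: a vector of names gives back a one-column data frame, while a single name gives back the column itself as a vector.

```julia
using DataFrames

df = DataFrame(A = 1:2:1000, B = repeat(1:10, inner = 50), C = 1:500)

one_col_df = df[:, [:A]]   # vector of column names: 500×1 DataFrame
col_vector = df[:, :A]     # single column name: a plain vector (a copy)

same_as_select = select(df, :A) == one_col_df  # both give the same data frame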

## Rename

The code below renames the existing columns `A` and `B` to `a` and `b`. Note that we use the broadcasted pair operator `.=>` to make this work. Because `select` keeps only the columns it is given, the result contains just `a` and `b`.

```
select(df, [:A, :B].=>[:a, :b])
```
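To see what the broadcasting does, the following sketch uses only Base Julia (the variable names are illustrative). The dotted form pairs the two vectors element by element, which yields one `old => new` pair per column.

```julia
# Broadcasting `=>` pairs the two vectors element by element.
pairs_broadcast = [:A, :B] .=> [:a, :b]
pairs_explicit  = [:A => :a, :B => :b]

is_same = pairs_broadcast == pairs_explicit  # the two forms are equal

# Without the dot we get a single Pair of two vectors,
# not one Pair per column.
single_pair = [:A, :B] => [:a, :b]
```

So `select(df, [:A, :B] .=> [:a, :b])` passes the same pairs as writing `:A => :a` and `:B => :b` out by hand.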

## Create a new column

Below we create a column called `C` that holds the element-wise sum of columns `A` and `B`. Because `df` already has a column named `C`, the new values replace it in the result.

```
select(df, :, [:A, :B]=>((a, b)->a.+b)=>:C)
```
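If the original `C` should be kept, one option is to write the sum under a different name. The sketch below assumes DataFrames.jl is installed and uses a hypothetical column name `D`; it also shows `ByRow`, which applies a function row by row so that no explicit broadcasting is needed.

```julia
using DataFrames

df = DataFrame(A = 1:2:1000, B = repeat(1:10, inner = 50), C = 1:500)

# `D` is a hypothetical new column name, chosen so that `C` is left intact.
with_sum = select(df, :, [:A, :B] => ((a, b) -> a .+ b) => :D)

# `ByRow(+)` applies `+` to each row's pair of values.
with_sum_byrow = select(df, :, [:A, :B] => ByRow(+) => :D)

same_result = with_sum == with_sum_byrow
```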

## Pipe operator

The pipe operator `|>` is part of Julia's `Base` module. I was pleasantly surprised that Julia has an operator similar to the pipe operator `%>%` in R.

The pipe operator is a helpful tool for chaining several functions together in a concise and legible way, instead of nesting the calls inside one another.

For example, suppose we want to raise each element of a vector `vec` to the power of 3 and then sum the results. There are several ways to achieve this, and the pipe operator works conveniently with each of them.

```
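vec = [1, 2, 3, 4, 5]
vec .^ 3 |> sum                        # broadcast the cube, then sum
[v^3 for v in vec] |> sum              # build the cubes with a comprehension, then sum
vec |> (x -> x .^ 3) |> (x -> sum(x))  # chain two anonymous functions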
```
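To make the link with nested calls explicit, here is a small sketch using only Base Julia. The names `xs`, `nested`, `piped`, and `composed` are illustrative; `xs` holds the same data as `vec` above. The composition operator `∘` is shown as a related alternative.

```julia
xs = [1, 2, 3, 4, 5]                   # same data as `vec` above

nested = sum(xs .^ 3)                  # ordinary nested call
piped  = xs |> (x -> x .^ 3) |> sum    # same steps, read left to right

cube_then_sum = sum ∘ (x -> x .^ 3)    # composition: cube first, then sum
composed = cube_then_sum(xs)

all_equal = nested == piped == composed
```

All three forms compute 1 + 8 + 27 + 64 + 125 = 225; the piped version simply lists the steps in the order they are applied.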