Autodesk AutoCAD is a program widely used by engineers and designers to create 2D and 3D models. It has a rich interface with lots of options, and it's intuitive and easy to learn, so users can get good results quickly. CAD stands for Computer-Aided Design, and Auto for Autodesk. The latest stable version was released in 2022.

At the same time, being able to drive AutoCAD through code gives us a powerful tool to solve many problems. This article explains how to do this. We'll use the AutoCAD .NET API to create a plugin that computes the shortest path in a given graph from each node to every other node (using Dijkstra's algorithm).

Our input is an `undirected graph` where each edge's weight is its length.

The output we want consists of 2 tables:

- Shortest path `distance` from each node to every other node.
- Shortest path `route` from each node to every other node.

For example, if we want to know the shortest path between B and E, the shortest route is `B I G J E` and the length of that path is 48.79.

Here we'll make a very simple plugin that, once loaded in AutoCAD, responds to the command `hello` and draws a circle in the model. This is a useful way to learn the first steps of creating any .NET plugin for AutoCAD.

The first step is to download and install Visual Studio Community, then create a new project, selecting `C# Class Library` for `.NET Framework`. We can name it `MyFirstCadPlugin`.

Once the new project is created, we need to add the `AutoCAD dll references` to access the AutoCAD .NET API. These references are listed below and are located in the folder where AutoCAD is installed, inside `Program Files`.

- acmgd.dll
- acdbmgd.dll
- accoremgd.dll

Now we have to configure the project's debug properties, setting the `start external program` option to start AutoCAD (`acad.exe`) while debugging.

Next, we should uncheck `loader lock` in the Exception Settings to allow Visual Studio to run AutoCAD while debugging.

We can use the following code in `Class1.cs` to create the plugin. As explained in its comments, the code first connects to the active AutoCAD document and database, then opens a transaction in which a Circle entity and a Text entity are defined.

Finally, if we press `Start` in Visual Studio, a new AutoCAD instance will open. We can then load our plugin by typing the `netload` command and browsing to `MyFirstCadPlugin.dll`, stored in `/bin/Debug` inside our C# project folder. Once it's loaded, typing `hello` in the command bar makes the circle with "hello!" inside appear in the model!

We already know how to create a basic .NET plugin for AutoCAD, so we can go deeper and focus on our real goal: a program that computes the `Shortest Path Matrices` for a given graph. The following Windows form summarizes the functionality of the program.

Here is how it works:

- The `Insert Sample Graph` button draws a sample graph into the model. This gives the user an example to try the program with. To use a custom graph, its node blocks should be of the same type as in the sample graph (block name "node", with a text label).
- The `Generate Shortest Path Matrix` button prompts the user to select a graph, then generates the output matrices and saves them as CSV files in the folder selected by the user.

We know a graph is composed of a set of edges and nodes, but we have to use AutoCAD elements to represent them. The edges can easily be treated as `Lines` or `Polylines`, but for the nodes there is no such direct AutoCAD object. Every node has 2 properties: a position (x and y) and a label. For example, the following picture shows a node with label = "B" and position (x = 138.89, y = 169.11).

There is an AutoCAD element that can represent nodes in a very simple and natural way: the `Block`. We will create a custom block with the desired shape and use it to represent the nodes.

The following code (explained below) is used to create custom blocks in C#.

These are the main functions:

`CircleBlockNodeEntities`: This method returns a list of entities to create a block node shaped as a circle with a letter inside. There are 2 entities in this block: circle and text.

`LeaderBlockNodeEntities`: Returns a list of entities to create a block node shaped as a leader line with its label above, like the following picture. There are 3 entities in this block: polyline, circle, and text.

`InsertBlockNodeToDb`: This method creates a block in the current model database. Its arguments include the list of entities returned by one of the methods explained before and the name we want to give to the block. For example, the following code creates a block named "node" with the `CircleBlock` entities.

```
List<Entity> blockNodeEntities = BlockNodeCreator.CircleBlockNodeEntities(acCurDb, new Point3d(0, 0, 0));
BlockNodeCreator.InsertBlockNodeToDb(bt, acDoc, acCurDb, "node", blockNodeEntities);
```

`DrawBlockNodeToModel`: This function draws a block node into the model. It receives as arguments the block's name, its label, and its position. For example, the following code draws a block named "node" with the label "B" at position (20, 100, 0).

```
DrawBlockNodeToModel(bt, acBlkTblRec, "node", "B", new Point3d(20, 100, 0));
```

We have settled how to represent a graph with AutoCAD elements; now we have to add some functionality to draw an entire sample graph. But where are we going to read the information for that sample graph from? In other words, how are we going to tell the program which edges (polylines or lines) and nodes (blocks) to draw?

Here is where `CSV files` (tables) can help us. The sample graph is described by 2 separate CSV files, one for the nodes and another for the edges, structured as follows.

`nodes.csv`

`edges.csv`

Each row of `nodes.csv` defines a node, with its label and position, and each row of `edges.csv` holds the information of one polyline vertex. These CSV files are embedded in the Resource folder. The next image corresponds to the 2 polylines defined in the table above:

With these 2 CSV files and the appropriate code to read them, we can draw any sample graph into the current AutoCAD model. The code to do this is presented next.

This code can be summarized as follows:

- Function `GetNodes` reads the corresponding CSV file and returns a list of nodes.
- Function `GetEdges` reads the CSV file and returns a dictionary where the *keys* are the `polyline_id` and the *values* are `Polyline AutoCAD objects`.
- Function `InsertSampleGraph` draws into the model the sample graph defined by the CSV files, through the 2 functions defined above.

So far we know how to represent a graph in AutoCAD and how to draw a sample one. It's time to attack our main goal: for a given graph, get the Shortest Path Matrices (one with the shortest distances, and the other with the routes that achieve those distances).

First we need to prompt the user to select a graph in the model. We do this with the following piece of code.

This code prompts the user to select the graph, then returns an array of `ObjectId` with all the selected elements. This function is quite reusable in other AutoCAD plugins, because we'll often need the user to select something in the model.

The code to perform Dijkstra and save the Shortest Path Matrices as CSV files is presented next.

The logic this code follows is:

- Filter the `ObjectId` array that comes from the `GraphModelSelector` function presented above. Every polyline and line is converted to an `Edge`, and every block node to a `Node`.
- Generate the `Adjacency Matrix` from the list of edges and nodes. We create a dictionary from the node list, where the *key* is a Tuple with the point's coordinates and the *value* is a Tuple with the node's label and index. Then we iterate over every edge and check whether both its `start_point` and `end_point` are keys in the dictionary; if so, we update the adjacency matrix, because those points are connected at a distance equal to the edge length. The next piece of code shows this (see lines 117 to 140 of the previous gist). Below is the adjacency matrix for our sample graph.
- Build a function that performs Dijkstra's algorithm, taking as arguments the adjacency matrix, the starting point, and the node list. This function is called `PerformDijkstra`, as you can see in the gist above, and returns an array of the struct `DistanceAndRoute`. For example, if we invoke the function for the second node (labeled B), it returns an array with the shortest distances from node B to every other node, and another array with the routes associated with those paths. See the picture below.
- Finally, we call `PerformDijkstra` from every node to obtain the output we want. `GenerateShortestPathMatrix` does this job and writes the Shortest Path Matrices to 2 CSV files (one for the distances and the other for the routes).
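The steps above can be sketched in a few lines of Python (the plugin itself is written in C#, and this is not its code). The function names, the tuple layouts and the tiny 3-node graph are assumptions made for illustration; only the overall logic (coordinate dictionary, adjacency matrix, Dijkstra from every node) follows the description.

```python
import heapq
import math


def build_adjacency_matrix(nodes, edges):
    """nodes: list of (label, (x, y)); edges: list of (start_point, end_point, length)."""
    # Dictionary keyed by coordinates, as in the description above.
    index_by_point = {point: i for i, (_, point) in enumerate(nodes)}
    n = len(nodes)
    matrix = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 0.0
    for start, end, length in edges:
        # Only edges whose two end points are nodes connect anything.
        if start in index_by_point and end in index_by_point:
            i, j = index_by_point[start], index_by_point[end]
            matrix[i][j] = matrix[j][i] = min(matrix[i][j], length)
    return matrix


def perform_dijkstra(matrix, source):
    """Return (distances, predecessors) from `source` to every node."""
    n = len(matrix)
    dist = [math.inf] * n
    prev = [None] * n
    dist[source] = 0.0
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry
        for v in range(n):
            w = matrix[u][v]
            if v != u and w != math.inf and d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(queue, (dist[v], v))
    return dist, prev


def route(prev, source, target):
    """Rebuild the node-index route from the predecessor list."""
    path = [target]
    while path[-1] != source:
        if prev[path[-1]] is None:
            return []  # target not reachable
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical graph: A-B (3), B-C (4) and a long direct edge A-C (10).
nodes = [("A", (0.0, 0.0)), ("B", (3.0, 0.0)), ("C", (3.0, 4.0))]
edges = [((0.0, 0.0), (3.0, 0.0), 3.0),
         ((3.0, 0.0), (3.0, 4.0), 4.0),
         ((0.0, 0.0), (3.0, 4.0), 10.0)]

matrix = build_adjacency_matrix(nodes, edges)
labels = [label for label, _ in nodes]
distance_table, route_table = [], []
for s in range(len(nodes)):
    dist, prev = perform_dijkstra(matrix, s)
    distance_table.append(dist)
    route_table.append([" ".join(labels[i] for i in route(prev, s, t))
                        for t in range(len(nodes))])
```

In this toy graph the route from A to C goes through B, because 3 + 4 is shorter than the direct edge of length 10.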

And that's it! We've built the plugin and it does exactly what we wanted!

Once we have tested our program, it's time to move from `debug` mode to `release` mode. We can change this in the Build menu, under "Configuration Manager".

To load the plugin, we open AutoCAD and type "netload" in the command bar. A file dialog will show up; we browse to our project files and, in the Release folder, select the dll with the project's name, for example `ShortestPathMatrix.dll`.

Now we can type `shortestpath` in the command bar, and our form will appear! Our plugin is ready to be used!

I hope you liked reading this article. As I said in the beginning, being able to drive AutoCAD through code is a powerful tool for solving many problems. In this GitHub repository you can find all the project files.


In this article we are going to use a dataset of employees as an example to find insights and relations between variables using regression, and to interpret the resulting reports.

Each employee's address is described by `latitude` and `longitude`, so we can check whether there is a correlation between these variables and the employee's wage. It's important to note that this is fictional data (the employees live in the middle of the Pacific Ocean 🌎😂).

As we can see in the following graphs, both variables seem to be correlated with wage.

- As latitude and longitude increase (north-east direction), wages grow
- As wages grow, employees seem to prefer to live in the north-east of the city

But... is there causality? If so, which is the dependent variable and which is the independent one? In other words, which variable causes the other? The answer is not always easy to find, but in this example we can infer that the company is located in a city where the better neighborhoods are in the north-east. So, as employees earn more, they prefer to move to the north-east of the city. Wage is the independent variable, and latitude and longitude are the dependent ones.

**Definitions**:

**Causal effects for variable X**: Changes in outcomes due to changes in X, holding all the other variables constant. Later we are going to build a model to predict employees' wages based on several variables such as gender, age, location, etc. We can say gender has a causal effect on wage if, holding the rest of the variables constant, changing the gender causes a change in wage.

**Confounding variable**: A variable that influences both the dependent variable and the independent variable, causing a spurious association. Imagine we find that motorbike accidents are highly correlated with umbrella sales: as umbrella sales go up, motorbike accidents increase. We could think that umbrella sales cause motorbike accidents, but what really happens is that rain affects both variables (umbrellas and accidents).

Now we are going to interpret the linear regression report, taking the `wage`-`latitude` regression as an example. This is the equation for simple linear regression:
$$
y = \alpha + \beta \ x
$$

The next table shows the regression report:

**R-squared**: This number is the fraction of the variance explained by the model. In our case it's just 2%, a very low number, but still positive (better than a horizontal line at the mean value). $$ R^2 = 1 - \frac{\text{unexplained variation}}{\text{total variation}} = 1 - \frac{SS_r}{SS_t} = 1- \frac{\sum_i (y_i-\hat{y}_i)^2}{\sum_i (y_i-\overline{y})^2} $$

**Adjusted R-squared**: R-squared comes with an inherent problem: if we add any independent variable to the regression (multiple linear regression), even one that has no relation with the dependent variable, R-squared will increase or stay the same. The adjusted R-squared "fixes" that problem by penalizing the number of variables. Adjusted R-squared is always less than or equal to R-squared. $$ R^2_{\text{adjusted}} = 1- \frac{(1 - R^2) \ (n-1)}{n-k-1} $$ where \(n\) is the number of points in our data sample and \(k\) the number of independent variables.

**F-statistic**: This test is used to see if we can reject the following null hypothesis: $$ H_0: \beta = 0 $$ $$ H_1: \beta \neq 0 $$ If we can't reject \(H_0\), our regression is useless, because our coefficient is not statistically significant. As we can see in the output for the `wage`-`latitude` regression, the p-value is less than 5%, so we can reject \(H_0\), meaning that our slope coefficient is statistically significant.

**Log-likelihood, AIC, BIC**: Without getting too deep into the math, the log-likelihood (\(l\)) measures how well a model fits the data. The more parameters we add, the higher the log-likelihood gets, but we don't want our model to over-fit; that's why the number of parameters (\(k\)) enters the equations as a penalty. $$ \text{AIC} = 2 \ k - 2 \ l $$ $$ \text{BIC} = \ln{(n)} \ k - 2 \ l \ $$ When comparing models fitted to the same data, we should pick the one with the lowest AIC and BIC (few parameters and high log-likelihood). AIC and BIC differ in the coefficient of \(k\): BIC uses \(\ln{(n)}\), which depends on the number of samples, so it penalizes extra parameters more heavily than AIC as soon as \(\ln{(n)} > 2\) (that is, \(n \geq 8\)).

**Variables section**: This is perhaps the most important part of the regression output. It gives the coefficients of our equation, which looks as follows: $$ \text{latitude} = -10.73 + 1.67 \times 10^{-6} \ \text{wage} $$ The rest of the table (standard_error, t, p_value, confidence_interval) shows one piece of information in different ways: the statistical significance of each coefficient. The constant term (const) doesn't tell us much (theoretically it would be the latitude for wage = 0), but it has to be there to build our line equation. In our example, the wage term has p_value = 0.4% < 5%, so we can consider it statistically significant.
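As a minimal sketch of the formulas above, the following Python functions compute R-squared, adjusted R-squared, AIC and BIC. The function names and the small `y` / `y_hat` lists are made up for illustration; they are not taken from the employees dataset or from the report.

```python
import math


def r_squared(y, y_hat):
    """1 - SS_r / SS_t, with residual and total sums of squares."""
    mean_y = sum(y) / len(y)
    ss_r = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_t = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_r / ss_t


def adjusted_r_squared(r2, n, k):
    """n: number of data points, k: number of independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    return math.log(n) * k - 2 * log_likelihood


# Illustrative values only (assumption, not the article's data).
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(y, y_hat)                      # about 0.99
r2_adj = adjusted_r_squared(r2, n=len(y), k=1)  # slightly lower than r2
```

Note that `adjusted_r_squared` can never exceed `r2` for `k >= 1`, and that `bic` grows faster than `aic` with `k` once `n` is 8 or more.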

Usually reality is too complex to explain one term with just one parameter, that's the reason why we want to add more variables in our regression: $$ y = \beta_0 + \beta_1 \ x_1 + \beta_2 \ x_2+ \beta_3 \ x_3 + \text{...} + \beta_i \ x_i $$

Continuing with our employees dataset, we're now going to build a model to predict the wage based on `latitude` and `longitude` (the other way around from before). Later we'll build another model with more parameters and check whether our regression improves.

This is our regression outcome:

As we can see, this regression doesn't have much value for the following reasons:

- F-statistic p_value is greater than 5%, meaning that we can't reject the null hypothesis: $$ H_0: \beta_1 = \beta_2 = 0 $$
- All coefficients p_value are also greater than 5%.
- R-squared is less than 2%.

Now we're going to try to improve the model adding the following variables:

- Gender
- Age
- Nationality
- Civil status
- Contract type (fixed or indefinite term)
- Management level (top-level, middle-level, low-level, laborer)

As you can see, almost all of these variables are categorical (all except age), so in order to apply regression we have to convert them into dummy variables. We can do this easily with `pandas` as follows:

```
df = pd.get_dummies(df, columns=['gender', 'nationality_group', 'management_level', 'contract_type'], drop_first=True)
```

We use `drop_first` because, for a categorical variable with n categories, knowing (n-1) of the dummies lets us infer the missing one (example: contract_type_indefinite_term = 0 means a fixed_term contract).

The regression output is shown next:

This model is much better than the previous one:

- F-statistic p_value < 5%
- Many of the regression parameters are statistically significant. The highest t-values correspond to `management_level`, meaning that this variable is clearly affecting `wage`.
- R-squared shows the model explains 83% of the variance (a large improvement over the previous model).
- AIC and BIC are lower than in the previous model.

Now we can measure the model's accuracy through the following metrics, where \(y_i\) are the actual values and \(\hat{y}_i\) the predicted values:

- MAE: Mean Absolute Error = 3.912 USD $$ \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| $$
- RMSE: Root Mean Squared Error = 6.578 USD $$ \text{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} $$
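Both metrics are easy to compute by hand. The sketch below uses three made-up wage values (an assumption for illustration, not the article's predictions) to show how MAE and RMSE follow from the definitions.

```python
import math


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Illustrative values only.
actual_values = [10.0, 12.0, 15.0]
predicted_values = [11.0, 10.0, 15.0]

mae = mean_absolute_error(actual_values, predicted_values)       # (1 + 2 + 0) / 3
rmse = root_mean_squared_error(actual_values, predicted_values)  # sqrt((1 + 4 + 0) / 3)
```

RMSE is never smaller than MAE, because squaring gives more weight to the largest errors.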

We use logistic regression when the dependent variable is categorical. For example, using our employees' dataset, let's say we want to predict whether an employee is a laborer or not, based on wage. $$ p(x) = \frac{1}{1+e^{-(\beta_0+\beta_1 \ x)}} $$

Next table is the logistic regression output:

In the same way that we can build a multiple linear regression, we can build a multiple logistic regression, using more than 1 independent variable. Following the same example as before, we can try to predict whether an employee is a laborer or not, not only with wage, but also with other parameters like age, gender, nationality, etc.

$$ p(x) = \frac{1}{1+e^{-(\beta_0+\beta_1 \ x_1 +\beta_2 \ x_2+ \text{...} + \beta_m \ x_m)}} $$
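To make the formula concrete, here is a small Python sketch of the logistic function with any number of independent variables. The coefficients and inputs are hypothetical values chosen for illustration; they are not the fitted coefficients from the report.

```python
import math


def logistic_probability(beta_0, betas, xs):
    """p(x) = 1 / (1 + exp(-(beta_0 + sum(beta_i * x_i))))."""
    linear_term = beta_0 + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-linear_term))


# Hypothetical coefficients: intercept 2.0 and one slope of -0.5.
p_low_x = logistic_probability(2.0, [-0.5], [1.0])    # linear term 1.5 -> p > 0.5
p_high_x = logistic_probability(2.0, [-0.5], [10.0])  # linear term -3.0 -> p < 0.5
```

The output is always between 0 and 1, and it equals 0.5 exactly when the linear term is 0, which is the usual decision boundary.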

Here you can find the Jupyter notebook and the dataset used to write this article.

Thank you for reading!

Then someone taught me about the Central Limit Theorem and how to compare means with the t-test, but at that time I didn't fully understand those concepts, and I realized that I had just learnt how to solve problems and exercises about them.

I think it's important to understand the whys and the intuition behind the theory before learning the mechanics of solving exercises. Tools like Python now make that much easier, and that's the purpose of this article. We are going to use Python to illustrate the first steps towards inferential statistics, with key concepts like the Central Limit Theorem and t-tests. As I usually say: "When I code it, then I understand it."

The CLT states that if we have a population of some variable, take m samples of size n, and calculate some parameter in each sample (for example the mean, median, standard deviation, etc.), the distribution of those m parameters will approach a normal distribution as n increases, and its variance will decrease as n increases (the distribution curve will narrow). This is true even if the original population doesn't follow a normal distribution.

As said before, I think the best way to understand the CLT is to practice with some data and obtain the expected results. We are going to apply the CLT to 3 distributions, computing the mean, the median and the standard deviation.

- Uniform distribution
- Normal distribution
- Binomial distribution

Now we are going to check the CLT by plotting histograms of sample parameters and showing how they change as we increase n (the sample size). We'll also plot the associated normal curve, and for that we'll need to know the standard error.

It's important to say that the CLT is usually applied to the mean but, as we'll see, we can apply it to other parameters too (median, variance, standard deviation). This is relevant because, for example, sometimes the median tells us much more than the mean.

What is the standard error? It's the standard deviation of the sampling distribution. It decreases as n (the sample size) increases. Each parameter has its own standard error formula. Next we show the expressions for the SE of the mean, the SE of the median and the SE of the standard deviation.

We can check this using the previous function. We'll compute the standard deviation of the sampled parameters manually and compare it to the formula value; they should be similar.

As we can see, the results are of the same order; the largest gap is in the standard deviation of the uniform distribution:

```
{
"binomialdist_example (mean, n=200)": {
"computed_se": 0.0504901583850754,
"formula_se": 0.05815619602703747
},
"normaldist_example (median, n=200)": {
"computed_se": 0.004247561325654817,
"formula_se": 0.004343732383749954
},
"uniformdist_example (std, n=200)": {
"computed_se": 0.013055146058296654,
"formula_se": 0.02204320895687334
}
}
```
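The same kind of check can be reproduced with the standard library alone. The sketch below is not the notebook's function: it draws m samples of size n from a uniform(0, 1) population (an assumption chosen because its standard deviation, \(\sqrt{1/12}\), is known exactly) and compares the standard deviation of the sample means with the well-known formula for the SE of the mean, \(\sigma / \sqrt{n}\).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

m, n = 2000, 50  # m samples, each of size n

# Mean of each sample drawn from a uniform(0, 1) population.
sample_means = [
    statistics.fmean([random.random() for _ in range(n)])
    for _ in range(m)
]

# "Computed" SE: standard deviation of the sampling distribution.
computed_se = statistics.stdev(sample_means)

# "Formula" SE of the mean: sigma / sqrt(n), with sigma = sqrt(1/12) for uniform(0, 1).
sigma = math.sqrt(1 / 12)
formula_se = sigma / math.sqrt(n)

print({"computed_se": computed_se, "formula_se": formula_se})
```

With these sizes the two values should agree to within a few percent; increasing m tightens the agreement, and increasing n makes both values smaller.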

Now we are going to build a function to plot histograms and distribution curves as n (sample size) increases:

As we can see, and as the CLT states, with any of these distributions, if we take m samples of size n and compute some parameter on them (for example the mean, median or standard deviation), the distribution of those m parameters will be approximately normal, and its variance will decrease as n increases (the distribution curve will narrow).

Now that we have a notion of what the CLT is about, we can apply this knowledge to understand the t-test. The t-test is normally used in the following cases:

We want to check whether the population mean is equal to or different from a given value \(\mu\), based on our sample. Here we are using the CLT directly, with this t statistic (we use the sample standard deviation \(s\) because we don't know the population variance): $$ t = \frac{\overline{x}-\mu}{{\frac{s}{\sqrt{n}}}} $$

We want to check whether, given 2 samples, their population means are equal or different. Assumptions:

**Homogeneity of Variance:**

Population variances are assumed to be equal. We can get an intuition for this assumption because the t-test is actually using the CLT to compare 2 means, so we have to apply the proper standard error. The standard error depends on the sample size and the population variance; different sample sizes and variances lead to different standard errors.

We can find an adjusted standard error if the sample sizes are different, but for different variances it's better to apply a different test altogether (Welch's test). These are the cases and their corresponding t-values, with their proper standard errors:

**Sample independence:**

It means that there are 2 different groups, not the same group measured twice. If the samples are paired (dependent), the t-statistic is very similar to the one-sample one, but our variable is the difference between the samples.

$$ t = \frac{\overline{d}-\mu_d}{{\frac{s_d}{\sqrt{n}}}} $$

Note: In all t-tests (1- or 2-sample) we assume the population follows a normal distribution but, as we have seen, the CLT states that as n increases, the sample mean (or other parameters) will approximately follow a normal distribution.

Next we are going to work through some examples for each test mentioned above, and we will also check that the t-statistics are correct by plotting the histogram and the distribution curve (as done before with the CLT).

We have the potato yield from 12 different farms. We know that the standard potato yield for the given variety is 𝜇=20. x = [21.5, 24.5, 18.5, 17.2, 14.5, 23.2, 22.1, 20.5, 19.4, 18.1, 24.1, 18.5] Test if the potato yield from these farms is significantly better than the standard yield.

We obtain a p-value of about 42%: if H0 (𝜇=20) were true, we would see a sample mean at least as large as ours about 42% of the time. So we can't reject H0, and we can't conclude that the yield of these farms is better than the standard one.
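The t-statistic behind this result can be reproduced with the standard library, applying the one-sample formula to the data above. This is a sketch, not the notebook's code; the variable names are illustrative.

```python
import math
import statistics

x = [21.5, 24.5, 18.5, 17.2, 14.5, 23.2, 22.1, 20.5, 19.4, 18.1, 24.1, 18.5]
mu_0 = 20  # standard yield under H0

n = len(x)
sample_mean = statistics.mean(x)
sample_std = statistics.stdev(x)          # sample standard deviation (n - 1 in the denominator)
standard_error = sample_std / math.sqrt(n)

t_stat = (sample_mean - mu_0) / standard_error
print(round(sample_mean, 3), round(t_stat, 3))
```

The sample mean is only slightly above 20 and the t-statistic is about 0.2, far from the values needed to reject H0 with 11 degrees of freedom, which is consistent with the large p-value.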

A sample is extracted from a normally distributed population. The sample mean is x̄ = 50 and the standard deviation is 𝑠=5. There are 30 observations. Consider the following hypotheses:

H0: 𝜇=48

H1: 𝜇≠48

With a significance level of 5%, can we reject H0?

A 5% significance level means 2.5% per tail, as we can see in the following picture:

We can reject H0 at the 5% significance level, meaning that it's very likely that 𝜇≠48.
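Since only summary statistics are given, the t-statistic follows directly from the one-sample formula. In this sketch the critical value 2.045 is the two-tailed 5% value for 29 degrees of freedom taken from standard t tables; it is hard-coded here rather than computed.

```python
import math

sample_mean = 50.0
sample_std = 5.0
n = 30
mu_0 = 48.0

t_stat = (sample_mean - mu_0) / (sample_std / math.sqrt(n))

# Two-tailed critical value for alpha = 5% and n - 1 = 29 degrees of freedom
# (value from standard t tables).
t_critical = 2.045

reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 4), reject_h0)
```

The statistic (about 2.19) falls just beyond the critical value, so it lands in the 2.5% upper tail and H0 is rejected.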

As said before, we want to check whether, given 2 samples, their population means are equal or different. Here we're going to check that the following t-statistic is correct by plotting histograms for the general case: equal or unequal sample sizes, similar variances. After that we'll do an example exercise for each case.

To do this check we'll tweak the plot_histograms_sample_parameter function, generating m samples of the difference between the sample means (with different n sizes). Rather than describing it in words, it's easier to understand by reading the code:

As we can see, the blue normal curves fit the histograms well, which means that our t-statistic is correct.

We can measure a person's fitness by measuring body fat percentage. The normal range for men is 15-20%, and the normal range for women is 20-25% body fat. We have sample data from a group of men and a group of women. The following dictionary shows the data.

```
example3_data = {
"men": [13.3, 6.0, 20.0, 8.0, 14.0, 19.0,
18.0, 25.0, 16.0, 24.0, 15.0, 1.0, 15.0],
"women": [22.0, 16.0, 21.7, 21.0, 30.0,
26.0, 12.0, 23.2, 28.0, 23.0]
}
```

Using the t-test, we want to know whether there is a significant difference between the population means of the men and women groups. These are our hypotheses:

H0: 𝜇_men = 𝜇_women

H1: 𝜇_men ≠ 𝜇_women

We can reject H0 at the 5% significance level, meaning that it's very likely that 𝜇_men ≠ 𝜇_women.
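As a sketch of the computation, the following code applies the standard pooled-variance two-sample t-statistic (unequal sample sizes, equal variances assumed) to the data above. The formula used is the textbook one, \(t = (\overline{x}_1 - \overline{x}_2) / (s_p \sqrt{1/n_1 + 1/n_2})\); the critical value 2.080 for 21 degrees of freedom comes from standard t tables and is hard-coded.

```python
import math
import statistics

example3_data = {
    "men": [13.3, 6.0, 20.0, 8.0, 14.0, 19.0,
            18.0, 25.0, 16.0, 24.0, 15.0, 1.0, 15.0],
    "women": [22.0, 16.0, 21.7, 21.0, 30.0,
              26.0, 12.0, 23.2, 28.0, 23.0],
}

men, women = example3_data["men"], example3_data["women"]
n1, n2 = len(men), len(women)

# Pooled variance: weighted average of the two sample variances.
pooled_variance = (
    (n1 - 1) * statistics.variance(men) + (n2 - 1) * statistics.variance(women)
) / (n1 + n2 - 2)

standard_error = math.sqrt(pooled_variance) * math.sqrt(1 / n1 + 1 / n2)
t_stat = (statistics.mean(men) - statistics.mean(women)) / standard_error

# Two-tailed critical value for alpha = 5% and n1 + n2 - 2 = 21 degrees of freedom
# (value from standard t tables).
t_critical = 2.080
reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 2), reject_h0)
```

The statistic is about -2.80 (men have the lower mean), whose absolute value exceeds the critical value, in line with the conclusion above.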

A study was conducted to investigate the effectiveness of hypnotism in reducing pain. Results are shown in the following dataframe. The before value is matched to an after value. Are the sensory measurements, on average, lower after hypnotism? Test at 5% significance level.

As we see, we can reject H0 with a p-value of 0.009478. So, based on our data, it's very likely that hypnotism reduces pain.

I hope this article helped you understand key concepts of inferential statistics, such as the Central Limit Theorem and how t-tests work, and gain confidence when applying them. Here you can find the full Jupyter Notebook used for writing this story.

[1] Ahn S., Fessler, J. (2003). Standard Errors of Mean, Variance, and Standard Deviation Estimators. The University of Michigan.

[2] machinelearningplus.com. One Sample T Test Clearly Explained with Examples | ML+. (2020, October 8). https://www.machinelearningplus.com/statistics/one-sample-t-test/

[3] bookdown.org. Practice 13 Conducting t-tests for Matched or Paired Samples in R. Retrieved April 9, 2022 from https://bookdown.org/logan_kelly/r_practice/p13.html

[4] jmp.com. The Two-Sample t-test. Retrieved April 9, 2022 from https://www.jmp.com/en_ch/statistics-knowledge-portal/t-test/two-sample-t-test.html

\begin{equation} k_a: \begin{cases} x^2+ \left( y+\frac{r}{2} \right)^2 = r^2 \newline z=0 \end{cases}\ \end{equation}

\begin{equation} k_b: \begin{cases} \left( y-\frac{r}{2} \right)^2 + z^2 = r^2 \newline x=0 \end{cases}\, \end{equation}

In this article we'll parameterize this beautiful surface and show that its surface area is the same as that of a sphere of radius \(r\) (\(4 \ \pi \ r^2\)), along with some other properties.

As mentioned above, the oloid is a ruled surface, formed by the segments AB, where A belongs to \(k_a\) and B to \(k_b\), as both points move along their circles.

\[ A = \left(\begin{array}{ccc}r\,\sin\left(\alpha\right), & -\dfrac{r}{2}-r\,\cos\left(\alpha\right), & 0 \end{array}\right) \]

\[ \beta = \dfrac{\pi}{2} - \alpha \]

\[ \sin(\beta) = \sin\left(\dfrac{\pi}{2} - \alpha\right) = \cos(\alpha) \]

\[ |\overrightarrow{\rm TM_A}|\, \sin(\beta) = r \implies |\overrightarrow{\rm TM_A}| = \left| \dfrac{r}{\cos(\alpha)}\right | \]

\[ T = \left(\begin{array}{ccc} 0, & -\dfrac{r}{2}-\dfrac{r}{\cos\left(\alpha\right)}, & 0 \end{array}\right) \]

\[ |\overrightarrow{\rm TM_B}|^2 = |\overrightarrow{\rm TB}|^2 + r^2 \]

\[ |\overrightarrow{\rm TM_B}|^2 = \left( \dfrac{r}{2}+\dfrac{r}{\cos\left(\alpha\right)}+\dfrac{r}{2} \right)^2 = \left( \dfrac{r + r\ \cos(\alpha)}{\cos\left(\alpha\right)} \right)^2 \]

\[ \cos(\gamma) = \dfrac{-r}{|\overrightarrow{\rm TM_B}|} = \dfrac{-\cos\left(\alpha\right)}{1 + \cos(\alpha)} \]

\[ B_y = \dfrac{r}{2}+r\ \cos(\gamma) = \dfrac{r}{2} - \dfrac{r\ \cos\left(\alpha\right)}{1 + \cos(\alpha)} \]

\[ B_z = r\ \sin(\gamma) \]

\[ \sin(\gamma)^2 = 1 - \cos(\gamma)^2 = 1 - \left( \dfrac{\cos\left(\alpha\right)}{1 + \cos(\alpha)} \right)^2 = \left( \dfrac{2\ \cos(\alpha) + 1}{(\cos(\alpha) + 1)^2} \right) \]

\[ B = \left(\begin{array}{ccc} 0, & \dfrac{r}{2} - \dfrac{r\ \cos\left(\alpha\right)}{1 + \cos(\alpha)}, & \dfrac{ \pm\ r\,\sqrt{2\,\cos\left(\alpha \right)+1}}{\cos\left(\alpha \right)+1} \end{array}\right) \]

The square root in the z coordinate of B creates the following restriction: \[ 2\ \cos(\alpha) + 1 \geq 0 \implies -\dfrac{2 \pi}{3} \leq \alpha \leq \dfrac{2 \pi}{3} \]

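The denominator \(1 + \cos(\alpha)\) in the coordinates of B never vanishes on this range, but \(\sqrt{2\cos(\alpha)+1}\) appears in the denominator of the derivative of B used later and vanishes at \(\alpha = \pm 2\pi/3\). To avoid zero denominators, the domain of \( \alpha \) becomes: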

\[ -\dfrac{2 \pi}{3} < \alpha < \dfrac{2 \pi}{3} \]

The oloid is a ruled surface generated by the AB segments, by the following equation, where v is between 0 and 1.

\[ A + v\ \overrightarrow{\rm AB} \]

\[ \overrightarrow{\rm AB} = \left(\begin{array}{ccc} -r\,\sin\left(\alpha \right), & \dfrac{r}{2}+r\,\cos\left(\alpha \right)-\dfrac{r\,\left(\cos\left(\alpha \right)-1\right)}{2\,\left(\cos\left(\alpha \right)+1\right)}, & \dfrac{\pm\ r\,\sqrt{2\,\cos\left(\alpha \right)+1}}{\cos\left(\alpha \right)+1} \end{array}\right) \]

\[ \overrightarrow{\rm AB} = \left(\begin{array}{ccc} -r\,\sin\left(\alpha \right), & \dfrac{r\,\left({\cos\left(\alpha \right)}^2+\cos\left(\alpha \right)+1\right)}{\cos\left(\alpha \right)+1}, & \dfrac{ \pm\ r\,\sqrt{2\,\cos\left(\alpha \right)+1}}{\cos\left(\alpha \right)+1} \end{array}\right) \]

\[ A + v\ \overrightarrow{\rm AB} = \left(\begin{array}{ccc} -r\,\sin\left(\alpha \right)\,\left(v-1\right), & \dfrac{r\,\left(2\,v-3\,\cos\left(\alpha \right)-2\,{\cos\left(\alpha \right)}^2+2\,v\,\cos\left(\alpha \right)+2\,v\,{\cos\left(\alpha \right)}^2-1\right)}{2\,\left(\cos\left(\alpha \right)+1\right)}, & \dfrac{\pm\ r\,v\,\sqrt{2\,\cos\left(\alpha \right)+1}}{\cos\left(\alpha \right)+1} \end{array}\right) \]

\[ 0 \leq v \leq 1,\ -\dfrac{2 \pi}{3} < \alpha < \dfrac{2 \pi}{3} \]

\[ |\overrightarrow{\rm AB}|^2 = r^2\ \sin(\alpha)^2 + \dfrac{r^2\,\left({\cos\left(\alpha \right)}^2+\cos\left(\alpha \right)+1\right)^2}{(\cos\left(\alpha \right)+1)^2} + \dfrac{r^2\,(2\,\cos\left(\alpha \right)+1)}{(\cos\left(\alpha \right)+1)^2} \] \[ |\overrightarrow{\rm AB}|^2 = r^2\,\left(1 -{\cos\left(\alpha \right)}^2+\frac{{\left({\cos\left(\alpha \right)}^2+\cos\left(\alpha \right)+1\right)}^2}{{\left(\cos\left(\alpha \right)+1\right)}^2} + \frac{2\,\cos\left(\alpha \right)+1}{{\left(\cos\left(\alpha \right)+1\right)}^2}\right) \] \[ t = \cos(\alpha) \] \[ |\overrightarrow{\rm AB}|^2 = r^2\,\left(1-t^2+\frac{{\left(t^2+t+1\right)}^2}{{\left(t+1\right)}^2}+\frac{2\,t+1}{{\left(t+1\right)}^2}\right) \] \[ |\overrightarrow{\rm AB}|^2 = r^2\,\left( \dfrac{(1-t^2)(t+1)^2+(t^2+t+1)^2+(2t+1)}{(t+1)^2} \right) \] \[ |\overrightarrow{\rm AB}|^2 = r^2\,\left( \dfrac{3t^2 + 6t + 3}{(t+1)^2} \right) \] \[ |\overrightarrow{\rm AB}|^2 = r^2\,\left( \dfrac{3\ (t+1)^2}{(t+1)^2} \right) \] \[ |\overrightarrow{\rm AB}|^2 = 3r^2 \]

\[ |\overrightarrow{\rm AB}| = \sqrt3 \ r \]
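This constant length is easy to check numerically. The sketch below evaluates A and B from the parameterization above (taking the positive sign for the z coordinate of B, and r = 1 by default) at a few arbitrary values of \(\alpha\) inside the domain and measures the distance between them. The function names are illustrative.

```python
import math


def point_a(alpha, r=1.0):
    """Point A on circle k_a (plane z = 0)."""
    return (r * math.sin(alpha), -r / 2 - r * math.cos(alpha), 0.0)


def point_b(alpha, r=1.0):
    """Point B on circle k_b (plane x = 0), positive z branch."""
    c = math.cos(alpha)
    return (0.0, r / 2 - r * c / (1 + c), r * math.sqrt(2 * c + 1) / (1 + c))


def segment_length(alpha, r=1.0):
    return math.dist(point_a(alpha, r), point_b(alpha, r))


# A few sample angles inside (-2*pi/3, 2*pi/3).
alphas = (-2.0, -1.0, 0.0, 0.5, 1.5, 2.0)
lengths = [segment_length(a) for a in alphas]
print(lengths, math.sqrt(3))
```

Every value in `lengths` matches \(\sqrt{3}\) up to floating-point rounding, and scaling `r` scales the length by the same factor.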

Because the oloid is a ruled surface, its area can be computed with the following formula (see this publication by J. B. Reynolds):

\[ \frac{d \ \overrightarrow{OB}}{d \ \alpha} = \left(\begin{array}{ccc} 0, & \dfrac{r\,\sin\left(\alpha \right)}{{\left(\cos\left(\alpha \right)+1\right)}^2}, & \dfrac{\pm\ r\,\sin\left(2\,\alpha \right)}{2\,{\left(\cos\left(\alpha \right)+1\right)}^2\,\sqrt{2\,\cos\left(\alpha \right)+1}} \end{array}\right) \]

\[ \frac{d \ \overrightarrow{OA}}{d \ \alpha} = \left(\begin{array}{ccc} r\,\cos\left(\alpha \right), & r\,\sin\left(\alpha \right), & 0 \end{array}\right) \]

Since we'll continue with the positive value of the third coordinate of the derivative of OB, we are computing the oloid's top surface. To get the total surface, we'll have to multiply the result of the integral by 2.

\[ (1-v)\ \frac{d \ \overrightarrow{OB}}{d \ \alpha} + v\ \frac{d \ \overrightarrow{OA}}{d \ \alpha} = \left(\begin{array}{ccc} r\,v\,\cos\left(\alpha \right), & \dfrac{r\,\sin\left(\alpha \right)\,\left(v\,{\cos\left(\alpha \right)}^2+2\,v\,\cos\left(\alpha \right)+1\right)}{{\left(\cos\left(\alpha \right)+1\right)}^2}, & -\dfrac{r\,\sin\left(2\,\alpha \right)\,\left(v-1\right)}{2\,{\left(\cos\left(\alpha \right)+1\right)}^2\,\sqrt{2\,\cos\left(\alpha \right)+1}} \end{array}\right) \]

\[ \overrightarrow{\rm AB} \times \left((1-v)\ \frac{d \ \overrightarrow{OB}}{d \ \alpha} + v\ \frac{d \ \overrightarrow{OA}}{d \ \alpha} \right) = \] \[ = \left(\begin{array}{ccc} -\dfrac{r^2\,\sin\left(\alpha \right)\,\left(3\,v\,\cos\left(\alpha \right)-\cos\left(\alpha \right)+1\right)}{\left(\cos\left(\alpha \right)+1\right)\,\sqrt{2\,\cos\left(\alpha \right)+1}}, & \dfrac{r^2\,\cos\left(\alpha \right)\,\left(3\,v\,\cos\left(\alpha \right)-\cos\left(\alpha \right)+1\right)}{\left(\cos\left(\alpha \right)+1\right)\,\sqrt{2\,\cos\left(\alpha \right)+1}}, & -\dfrac{r^2\,\left(3\,v\,\cos\left(\alpha \right)-\cos\left(\alpha \right)+1\right)}{\cos\left(\alpha \right)+1} \end{array}\right) \]

\[ \left| \overrightarrow{\rm AB} \times \left((1-v)\ \frac{d \ \overrightarrow{OB}}{d \ \alpha} + v\ \frac{d \ \overrightarrow{OA}}{d \ \alpha} \right) \right|^2 = \frac{2\,r^4\,{\left(3\,v\,\cos\left(\alpha \right)-\cos\left(\alpha \right)+1\right)}^2}{2\,{\cos\left(\alpha \right)}^2+3\,\cos\left(\alpha \right)+1} \]
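
The step from this squared norm to the area integral can be spelled out as follows. This is a sketch of the intermediate algebra, assuming, as stated earlier, that v runs from 0 to 1. Writing \(N(\alpha, v)\) (a name introduced here only for brevity) for the norm of the cross product, the square root gives

\[ N(\alpha, v) = \dfrac{\sqrt{2}\ r^2\,\left(3\,v\,\cos(\alpha)-\cos(\alpha)+1\right)}{\sqrt{2\,{\cos(\alpha)}^2+3\,\cos(\alpha)+1}} \]

No absolute value is needed in the numerator: it is linear in v, equals \(1-\cos(\alpha) \geq 0\) at v = 0 and \(2\,\cos(\alpha)+1 > 0\) at v = 1, so it is non-negative on the whole range. Integrating over v then gives the numerator of the area integrand:

\[ \int_{0}^{1} \left(3\,v\,\cos(\alpha)-\cos(\alpha)+1\right) dv = \dfrac{3}{2}\cos(\alpha) - \cos(\alpha) + 1 = \dfrac{1}{2}\cos(\alpha) + 1 \]

The remaining factor 2 in front of the integral accounts for the bottom half of the surface.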

\[ A =2 \ \sqrt{2} \ r^2 \int_{-2\pi/3}^{2\pi/3} \frac{\frac{1}{2} \cos{\alpha} + 1}{\sqrt{2\,{\cos\left(\alpha \right)}^2+3\,\cos\left(\alpha \right)+1}} \,d\alpha\ \]

\[ A =\left. 2 \ \sqrt{2} \ r^2 \ \dfrac{\cos\left(\dfrac{\alpha}{2}\right) \sqrt{2\ \cos(\alpha) + 1} \left( \sin^{-1}\left( \dfrac{2 \sin\left( \dfrac{\alpha}{2}\right)}{\sqrt{3}}\right) + \tan^{-1} \left( \dfrac{\sin\left(\dfrac{\alpha}{2} \right)}{\sqrt{2\ \cos{\alpha} + 1} } \right)\right)}{\sqrt{2\,{\cos\left(\alpha \right)}^2+3\,\cos\left(\alpha \right)+1}} \right|_{-2\pi/3}^{2\pi/3} \]

\[ A =\left. 2 \ \sqrt{2} \ r^2 \ \dfrac{\cos\left(\dfrac{\alpha}{2}\right) \sqrt{2\ \cos(\alpha) + 1} \left( \sin^{-1}\left( \dfrac{2 \sin\left( \dfrac{\alpha}{2}\right)}{\sqrt{3}}\right) + \tan^{-1} \left( \dfrac{\sin\left(\dfrac{\alpha}{2} \right)}{\sqrt{2\ \cos{\alpha} + 1} } \right)\right)}{\sqrt{(2\ \cos(\alpha) + 1)(\cos(\alpha) + 1)}} \right|_{-2\pi/3}^{2\pi/3} \]

\[ A =\left. 2 \ \sqrt{2} \ r^2 \ \dfrac{\cos\left(\dfrac{\alpha}{2}\right) \left( \sin^{-1}\left( \dfrac{2 \sin\left( \dfrac{\alpha}{2}\right)}{\sqrt{3}}\right) + \tan^{-1} \left( \dfrac{\sin\left(\dfrac{\alpha}{2} \right)}{\sqrt{2\ \cos{\alpha} + 1} } \right)\right)}{\sqrt{\cos(\alpha) + 1}} \right|_{-2\pi/3}^{2\pi/3} \]
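
The evaluation at the limits can be made explicit with one observation, added here as a check on the step rather than as a new result. Since \(\cos(\alpha) + 1 = 2\,\cos^2(\alpha/2)\) and \(\cos(\alpha/2) > 0\) on the integration range, the prefactor is constant:

\[ \dfrac{\cos\left(\dfrac{\alpha}{2}\right)}{\sqrt{\cos(\alpha) + 1}} = \dfrac{1}{\sqrt{2}} = \dfrac{\sqrt{2}}{2} \]

At the upper limit, \(\sin(\pi/3) = \sqrt{3}/2\) and \(2\,\cos(\alpha)+1 \to 0^{+}\), so both inverse functions tend to \(\pi/2\):

\[ \sin^{-1}\left( \dfrac{2}{\sqrt{3}} \cdot \dfrac{\sqrt{3}}{2} \right) + \lim_{\alpha \to 2\pi/3} \tan^{-1} \left( \dfrac{\sin\left(\dfrac{\alpha}{2} \right)}{\sqrt{2\ \cos(\alpha) + 1} } \right) = \dfrac{\pi}{2} + \dfrac{\pi}{2} = \pi \]

The expression is odd in \(\alpha\), so the lower limit contributes \(-\pi\) before the sign change from the subtraction, and each limit ends up contributing \(\dfrac{\sqrt{2}\,\pi}{2}\).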

\[ A =2 \ \sqrt{2} \ r^2 \left( \dfrac{\sqrt{2} \pi}{2} + \dfrac{\sqrt{2} \pi}{2} \right) = 2 \ \sqrt{2} \ r^2 \left( \sqrt{2}\ \pi \right) = 4\ \pi \ r^2 \]

This is the same as the surface area of a sphere of radius r.

You can find a parametrized oloid in this GeoGebra link.

]]>On the other hand, being able to make API calls and process the response provides a new world of endless possibilities. Nowadays many companies give access to their data via certain endpoints.

Why not put these two tools together? In this article we'll explain how to do it.

There are 2 main ways in Excel to do it:

- Via Visual Basic script
- Via making a query from the data menu

For the Visual Basic approach, the first step is to enable the Developer tab. This can be done in File > Options > Customize Ribbon:

Once this is done we have to open the VBA editor.

To process the JSON response of the API call, we need the JsonConverter module, which can be downloaded from https://github.com/VBA-tools/VBA-JSON/releases. Import JsonConverter.bas into the project: in the VBA editor, go to File > Import File.

We also need to add two references to the project, via Tools > References:

- Microsoft XML, v6.0
- Microsoft Scripting Runtime

Next we have to create a new module for the code that will make the API call. Here I present two examples:

- Get the users from https://jsonplaceholder.typicode.com/

- Get the people from the Star Wars API (https://swapi.dev/).
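
The two examples can be sketched roughly as follows. This is a minimal sketch, not necessarily the original listing: it assumes the JsonConverter module and both references above are already in the project. The helper `HttpGet`, the sub names, and the choice of fields and columns are illustrative assumptions.

```vb
Option Explicit

' Hypothetical helper: performs a synchronous GET request and returns the body.
' Relies on the "Microsoft XML, v6.0" reference.
Private Function HttpGet(ByVal url As String) As String
    Dim http As New MSXML2.XMLHTTP60
    http.Open "GET", url, False
    http.setRequestHeader "Accept", "application/json"
    http.send
    If http.Status <> 200 Then
        Err.Raise vbObjectError + 513, "HttpGet", _
                  "Request failed with status " & http.Status
    End If
    HttpGet = http.responseText
End Function

' Example 1: the endpoint returns a JSON array, which VBA-JSON
' parses into a Collection of Dictionaries.
Public Sub GetUsers()
    Dim users As Collection
    Dim user As Object
    Dim row As Long

    Set users = JsonConverter.ParseJson( _
        HttpGet("https://jsonplaceholder.typicode.com/users"))

    With ActiveSheet
        .Cells(1, 1).Value = "id"
        .Cells(1, 2).Value = "name"
        .Cells(1, 3).Value = "email"
        row = 2
        For Each user In users
            .Cells(row, 1).Value = user("id")
            .Cells(row, 2).Value = user("name")
            .Cells(row, 3).Value = user("email")
            row = row + 1
        Next user
    End With
End Sub

' Example 2: the endpoint returns a JSON object whose "results" key
' holds the array of people (first page only).
Public Sub GetStarWarsPeople()
    Dim response As Object
    Dim person As Object
    Dim row As Long

    Set response = JsonConverter.ParseJson( _
        HttpGet("https://swapi.dev/api/people/"))

    With ActiveSheet
        .Cells(1, 1).Value = "name"
        .Cells(1, 2).Value = "height"
        .Cells(1, 3).Value = "birth_year"
        row = 2
        For Each person In response("results")
            .Cells(row, 1).Value = person("name")
            .Cells(row, 2).Value = person("height")
            .Cells(row, 3).Value = person("birth_year")
            row = row + 1
        Next person
    End With
End Sub
```

Either sub can be run from the VBA editor or from the Macros dialog, and writes its rows into the active sheet starting at cell A1. The Star Wars response is paginated, so this sketch only reads the first page. Following the `next` URL in the response would be needed to fetch everyone.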

If you want to save the Excel file, remember to use the .xlsm extension, which allows macros.

Excel 2016 has a built-in feature for making API calls. Earlier versions can do it too, after installing the Power Query add-in. To make an API call, go to the Data tab and click New Query > From Other Sources > From Web.

Then we click on Advanced. Here we enter the URL and, if credentials are needed, they can be sent as a request header.

Hope it was helpful!

]]>