Tuesday, May 29, 2018

ASP.Net bundling & minification via CDN Caching

It is normal for ASP.Net applications to use bundles to reduce the number of HTTP calls. It all works until a CDN enters the landscape. A CDN (Content Delivery Network) delivers content from servers distributed worldwide, which we can envision as serving from the edge of the network. That way many requests get served by the CDN without ever reaching our web server. So much for the theory of CDNs. But what happens between ASP.Net bundling and a CDN?

CDNs normally cache static assets. Out of the box, they cache well-known static asset types such as HTML, CSS and JS files, and image formats too if we configure them. But static assets bundled via the ASP.Net bundling mechanism get a different URL than plain .js files, so the CDN does not recognize them as static assets. Hence no caching.

This applies when the CDN is placed transparently in between. Akamai provides such a service where we don't need to change our code when we introduce the CDN; its servers cache content as they fetch it from the backend web server. Normally when we introduce a CDN, the domain has to change to serve from the CDN. That has an added benefit too, since it increases the browser's parallel request limit per origin. But if we want to integrate a CDN via a separate domain, we can better forget about ASP.Net bundling.

Here let's see how to make pass-through CDNs, which don't introduce a new CDN URL, work with ASP.Net bundling.

How can we make the CDN a friend of ASP.Net bundles?

Approach 1 - Don't use bundling

Sounds easy, but if the ASP.Net application is designed as a product hosted by individual customers, not every customer may have CDN capability, so bundling is still needed. For hosted applications, i.e. where there is only one instance of the application in the world, yes, we can try this. But without HTTP/2 it would still cause performance degradation, as earlier HTTP versions open a separate connection for each file / resource request. If the application uses Angular or any other SPA framework, the number of files may be large.

Approach 2 - Have CDN cache the ASP.Net bundle URL

When we bundle, ASP.Net provides a new URL instead of the individual files inside the bundle. From the browser we will see only one bundle URL request instead of requests for each resource in the bundle. That URL has a predefined format. If the CDN supports specifying custom URL patterns to cache, e.g. via RegEx, it works without much effort.

The RegEx for recognizing ASP.Net bundle URL format goes as follows.

/\?v=[a-zA-Z0-9_-]{44}$/

This works only if no other API or dynamic resource URL ends with ?v={44-character value}.
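A quick way to sanity-check the pattern is to try it against sample URLs. Below is a minimal sketch using Python's re module; the sample URLs are made up purely for illustration.

```python
import re

# Regex for the ASP.Net bundle version query string: "?v=" followed by a
# 44-character base64url-style token at the very end of the URL.
BUNDLE_RE = re.compile(r"\?v=[a-zA-Z0-9_-]{44}$")

def is_bundle_url(url: str) -> bool:
    """Return True if the URL ends with an ASP.Net bundle version token."""
    return BUNDLE_RE.search(url) is not None

# Hypothetical sample URLs.
bundled = "/bundles/scripts?v=" + "A" * 44
api_call = "/api/orders?v=1.0"
print(is_bundle_url(bundled))   # True
print(is_bundle_url(api_call))  # False
```

The $ anchor is what keeps an API URL like /api/orders?v=1.0 from matching, which is exactly the caveat above.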

Approach 3 - Embrace SPA with JS modules & forget server side rendering 

Instead of the server-side rendering done by ASP.Net, embrace the industry trend of SPA (Single Page Application) with a JavaScript module bundler such as Webpack. Webpack can produce static bundles which can be referenced in the application like normal files.
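As a sketch, a minimal Webpack configuration that emits such a static bundle might look like the following; the entry path and output folder are assumptions to adjust per project.

```javascript
// webpack.config.js -- a minimal sketch; './src/index.js' and 'dist' are
// assumed paths, not part of the original post.
const path = require('path');

module.exports = {
  mode: 'production',
  entry: './src/index.js',
  output: {
    // The content hash in the file name gives each build a fresh static URL,
    // so a CDN caches it like any other .js file and busts on change.
    filename: 'app.[contenthash].js',
    path: path.resolve(__dirname, 'dist')
  }
};
```

Because the hash changes only when the content changes, the bundle can be cached aggressively without the ?v= query string trick.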

Recommended

If there is enough budget and time, embrace SPA; else use the RegEx and stay with ASP.Net bundling and the CDN.

I don't have any association with Akamai except using their services in projects. This post is not intended to promote any of their products.

Tuesday, May 15, 2018

Azure @ Enterprise - Finding how many nodes are really created for one HDInsight cluster

When we create an Azure HDInsight cluster, it internally creates virtual machines. The Azure portal's cluster creation blade asks for details about the Head and Worker nodes. We can set the number of worker nodes but not head nodes. All good till now.

But @ enterprise, if the HDInsight cluster needs to be in a vNet, there could be issues with the lack of IP addresses available in the subnet. It gets worse if the creation needs to happen dynamically in a multi-tenant application. It is very difficult to calculate the IP address requirements of an HDInsight cluster if we don't know internally how many VMs get created as part of one cluster regardless of the worker node count.

Is that information available publicly? Yes it is, and below are links to it.
https://blogs.msdn.microsoft.com/azuredatalake/2017/03/10/nodes-in-hdinsight/
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-port-settings-for-services

The above tells us that for Spark it creates Head nodes, ZooKeeper nodes and Gateway nodes. But how do we validate how many machines were created, or verify the facts ourselves? The portal never shows how many machines were created when we navigate to an already created HDInsight cluster's resource blade. The PowerShell object of the HDInsight cluster instance doesn't have direct info about the internal machines either. So what is the alternative?

PowerShell to retrieve nodes

Again, PowerShell and some string comparisons to the rescue. Below goes the script.

$hdiClusterName = "<name of cluster without domain>"

"Assumption 1 - The vNet and subnet of all nodes are the same."
"Assumption 2 - The vNet, public IP addresses & NICs are in the same resource group"
"Assumption 3 - There will be gateway nodes for the HDICluster and the public IP address for the gateway is in the format publicIpgateway-<internal id>"
"Assumption 4 - A unique internal id is used to name the nodes, NICs, public addresses etc. This script heavily depends on that internal id based naming convention"

"--------------------------------------------------------"

$resource =(Get-AzureRmResource -ResourceId (Get-AzureRmHDInsightCluster -clustername $hdiClusterName).Id)

$hdiClustersVNetResourceGroupName = (Get-AzureRmResource -ResourceId $resource.Properties.computeProfile.roles[0].virtualNetworkProfile.id).ResourceGroupName

"ResourceGroup of vNet associated with HDI cluster - $hdiClustersVNetResourceGroupName"

$publicAddress = (Get-AzureRmPublicIpAddress -ResourceGroupName $hdiClustersVNetResourceGroupName) | Where-Object {$_.DnsSettings.DomainNameLabel -eq $hdiClusterName}

$publicIpgatewayName = $publicAddress.Name

$hdiClusterInternalId = $publicIpgatewayName.Split('-')[1]

"Internal Id of HDI used to create nodes - $hdiClusterInternalId"

"Below are the NICs used by $hdiClusterName HDI Cluster. Each NIC corresponds to one node."

$nics = Get-AzureRmNetworkInterface -ResourceGroupName $hdiClustersVNetResourceGroupName
$nics = $nics | Where-Object {$_.Name -like "*$hdiClusterInternalId"}
$nics | Select-Object -Property Name

As we can see, the script relies on the naming convention of the NICs. If Microsoft changes it, the script will fail.

From the list we can see it creates 2 Head nodes, 3 ZooKeeper nodes and 2 Gateway nodes along with a minimum of 1 Worker node. So a minimum of 8 IP addresses will be consumed by one HDInsight cluster. At the time of writing this post the ZooKeeper and Gateway nodes seem to be free; the charge is only for the Head and Worker node(s).
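Based on those counts, a rough subnet-sizing calculation can be sketched as below. The 8-IP minimum comes from the node list above; the 5 reserved addresses per subnet is Azure's standard reservation in every subnet. The worker count per cluster is a parameter, since only that part is configurable.

```python
# Rough subnet-sizing check: 2 head + 3 ZooKeeper + 2 gateway + 1 worker
# gives the 8-IP minimum per cluster observed above.
AZURE_RESERVED_IPS = 5                # addresses Azure reserves per subnet
MIN_IPS_PER_CLUSTER = 2 + 3 + 2 + 1   # head + zookeeper + gateway + 1 worker

def max_clusters(subnet_prefix_length: int, worker_nodes: int = 1) -> int:
    """How many HDInsight clusters fit in a /N subnet, given workers each."""
    total = 2 ** (32 - subnet_prefix_length)
    usable = total - AZURE_RESERVED_IPS
    # Fixed nodes (7) plus the requested number of workers.
    per_cluster = MIN_IPS_PER_CLUSTER - 1 + worker_nodes
    return max(usable // per_cluster, 0)

print(max_clusters(27))      # /27 -> 32 - 5 = 27 usable -> 3 minimal clusters
print(max_clusters(24, 10))  # /24 with 10 workers each -> 14 clusters
```

This is only a lower-bound estimate; scaling a cluster's worker count later consumes further addresses from the same subnet.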

Ambari Portal

Another way is via the Ambari portal. If we navigate to the below URL, we can see the head nodes and ZooKeeper nodes, but not the gateway nodes.

https://<cluster name>.azurehdinsight.net/#/main/hosts

Happy scripting...

Tuesday, May 8, 2018

Azure @ Enterprise - Checking connectivity from AppServiceEnvironment to HDInsight

The background here is an enterprise Azure environment where most of the resources are in a vNet with their own subnets. When the Spark HDInsight clusters are in a separate subnet from the application, there will be no connectivity by default if we need to submit jobs via Livy or anything like that (this again depends on the enterprise policy). We have to open the routes from the application subnet to the HDInsight subnet. How the routes are opened depends on how the infrastructure is laid out. If there are no firewalls or proxies between the application and the HDInsight clusters, simple NSG rules would be sufficient.

Suppose there are 2 teams involved, one infrastructure and the other development or QA; how can development or QA verify that there is connectivity?

If the application is hosted in virtual machines, we can just log in and open the Ambari UI; we can even run network troubleshooting commands. But what to do if the applications are hosted as AppService WebApps? If the applications are not client facing and need to be secured from neighbors, they may be inside their own AppServiceEnvironments. Basically, no user interface is available.

The solution is simple: back to command line mode and somehow check the HTTP connectivity to the HDInsight cluster. Below is one PowerShell command which we can execute from the command line interface exposed by Kudu.

curl -Credential "user name" -Method "GET" -Uri "https://<cluster name>.azurehdinsight.net/livy/batches"

How to reach the Kudu console of an AppService instance is detailed in the below links.
https://blogs.msdn.microsoft.com/benjaminperkins/2017/11/08/how-to-access-kudu-scm-for-an-azure-app-service-environment-ase/
https://blogs.msdn.microsoft.com/benjaminperkins/2014/03/24/using-kudu-with-windows-azure-web-sites/

The command was tested against non domain joined HDInsight clusters. When we enter the above command, it will ask for the password interactively.

This is just a manual command to test the connectivity. If the scenario is multi-tenant and we want to ensure connectivity from the application itself, use WebClient or similar methods.
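For an automated check from application code, a rough sketch of the same call using only Python's standard library is below. The cluster name and credentials are placeholders; calling urllib.request.urlopen(req) would then perform the actual connectivity test against the Livy endpoint.

```python
import base64
import urllib.request

def livy_request(cluster: str, user: str, password: str) -> urllib.request.Request:
    """Build an authenticated GET request for the Livy batches endpoint."""
    url = f"https://{cluster}.azurehdinsight.net/livy/batches"
    # Livy on a non domain joined cluster accepts HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

# Placeholder values; urlopen(req) would do the real check.
req = livy_request("mycluster", "admin", "secret")
print(req.full_url)  # https://mycluster.azurehdinsight.net/livy/batches
```

The same structure translates directly to .NET's WebClient/HttpClient with a Basic auth header.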

Update 1 @ 24Jan2019

If the below error occurs when leveraging the same technique for web app to web app communication testing, try to force the connection via TLS 1.2.

invoke-RestMethod : The underlying connection was closed: An unexpected error occurred on a send.
At line:1 char:1
+ invoke-RestMethod -Method "GET" -Uri "https://eastus-sapkeyvaultservi ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-RestMethod], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeRestMethodCommand
PS D:\home>


TLS 1.2 can be enforced with the below code snippet.
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

Update 2 @ 3Feb2019

Sometimes the below error can be thrown when PowerShell is not able to parse the response.
The response content cannot be parsed because the Internet Explorer engine is not available, or Internet Explorer's first-launch configuration is not complete. Specify the UseBasicParsing parameter and try again

The solution is to add the basic parsing switch.

curl -Credential "user name" -Method "GET" -Uri "https://<cluster name>.azurehdinsight.net/livy/batches" -UseBasicParsing


Tuesday, May 1, 2018

Azure @ Enterprise - Finding Subnet of HDICluster via PowerShell

Enterprises love putting resources into a virtual network (vNet), thinking that it brings a certain level of free security via isolation. HDInsight clusters can also be put into a vNet. But what to do if we have an HDInsight cluster which was added to a subnet earlier and we don't know the subnet name now?

Silly question, isn't it? Just go to the portal and see the cluster properties. That is what the below Microsoft docs article says.
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-administer-use-portal-linux#list-and-show-clusters

Unfortunately our corporate subscription only shows the vNet name in the properties, not the subnet. So how to get the subnet name?

Finding Subnet of HDICluster

PowerShell helps us here. Below goes the script.

$hdiClusterName = "POC001Joy"
$hdiCluster = Get-AzureRmHDInsightCluster -ClusterName $hdiClusterName
$resourceProperties = (Get-AzureRmResource -ResourceId $hdiCluster.Id).Properties
$resourceProperties.computeProfile.roles[0].virtualNetworkProfile.subnet

It simply gets the HDInsight resource object and navigates to the required property. The object model is a little confusing, but it's there.

The above PowerShell script can be entered directly into the Azure Portal's Cloud Shell or run after logging into Azure from a PowerShell window / ISE.
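The subnet value printed by the script is a full ARM resource ID rather than a bare name. Below is a small sketch to split out the vNet and subnet names from such an ID; the sample ID is hypothetical, with the real one coming from the script's output.

```python
def parse_subnet_id(resource_id: str) -> dict:
    """Extract vNet and subnet names from a subnet ARM resource ID."""
    # ARM resource IDs alternate /segmentType/segmentName pairs.
    parts = resource_id.strip("/").split("/")
    pairs = dict(zip(parts[::2], parts[1::2]))
    return {"vnet": pairs["virtualNetworks"], "subnet": pairs["subnets"]}

# Hypothetical sample ID for illustration.
sample = ("/subscriptions/00000000-0000-0000-0000-000000000000"
          "/resourceGroups/rg1/providers/Microsoft.Network"
          "/virtualNetworks/vnet1/subnets/subnet1")
print(parse_subnet_id(sample))  # {'vnet': 'vnet1', 'subnet': 'subnet1'}
```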

An issue has been reported against the documentation article about this missing subnet name.

Enjoy Scripting...