Setting Up Auto Scaling on Huawei Cloud Accounts
Introduction: Auto Scaling, But Make It Less Terrifying
Let’s be honest: “auto scaling” sounds like something you need a wizard for. In reality, it’s more like teaching your infrastructure to respond to weather. When traffic looks like a summer storm rolling in, your system should automatically add more servers. When things cool down, it should scale back to avoid wasting money. Huawei Cloud can do that, and with the right setup on your account, it becomes a pretty civilized experience.
This guide walks you through setting up auto scaling on Huawei Cloud accounts: the steps you take in the console, the prerequisites, and the practical choices you'll need to make along the way. We'll keep it readable, structured, and full of the "here's what to click and what to watch out for" energy that usually saves you from the classic problem: "Why did nothing scale when the metric screamed for help?"
What You’re Actually Setting Up (A Quick Mental Model)
Auto scaling is not a single magic switch. It’s a combination of pieces that work together:
- Scaling group (or auto scaling group): the set of compute instances that can be scaled.
- Scaling policy: the rule for when and how to scale (scale out, scale in).
- Alarms / metrics: the trigger signal (CPU utilization, request rate, queue length, etc.).
- Instance configuration: what your instances should look like when they’re created (image, flavor, network settings).
- Limits and cooldown: safety rails so you don’t spawn 400 servers because a single metric blinked.
Think of it like ordering takeout with instructions: “If my food is taking too long (alarm), add more chefs (scale out). If everything is done and quiet (alarm), reduce the kitchen staff (scale in). Also, please don’t fire the chef because the microwave beeps once.” The cooldown is your “don’t overreact” rule.
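To make the "don't overreact" rule concrete, here is a toy sketch of a cooldown gate in Python. It is purely illustrative: Huawei Cloud's auto scaling service implements this logic for you, and nothing below is its API.

```python
# Toy sketch of a cooldown gate -- illustrative only, not Huawei Cloud code.
COOLDOWN_SECONDS = 300
SCALE_OUT_THRESHOLD = 70.0  # percent CPU

last_action_time = -float("inf")  # no scaling action taken yet

def maybe_scale_out(now: float, cpu_percent: float) -> bool:
    """One decision tick: scale out only if hot AND cooldown has elapsed."""
    global last_action_time
    if cpu_percent <= SCALE_OUT_THRESHOLD:
        return False
    if now - last_action_time < COOLDOWN_SECONDS:
        return False  # metric is hot, but we acted recently: wait it out
    last_action_time = now
    return True

print(maybe_scale_out(now=0.0, cpu_percent=75.0))   # True: alarm fires
print(maybe_scale_out(now=60.0, cpu_percent=80.0))  # False: still cooling down
```

The second call is the microwave beeping: the metric is still hot, but the gate absorbs it instead of firing another chef.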
Prerequisites: Before You Touch Auto Scaling
Before you create anything, confirm your account has what it needs. Nothing ruins momentum like discovering you have no quotas or your network is missing.
1) Check Your Huawei Cloud Account Permissions
Make sure your user has permissions to create and manage resources. You typically need authorization for the auto scaling service itself and for the compute instances (ECS) it will create and terminate.
If your team uses IAM roles, verify that your role includes actions related to:
- Auto Scaling (and related orchestration components, depending on the region/service setup)
- ECS instance creation and termination
- VPC networking components (if required)
Tip: If you can’t find the menu item for auto scaling in the console, it’s often permission-related rather than a “product not supported” situation.
2) Quota and Resource Availability
Auto scaling is only as powerful as your ability to provision resources. Check quotas for:
- ECS cores / instances
- IP addresses (depending on network design)
- Volume quotas (if you attach data disks)
If your quota is low, scaling might pause at the “we tried, but we can’t” stage. That’s the infrastructure version of trying to open the door and realizing the handle isn’t there.
3) Networking Basics (VPC, Subnets, Security Groups)
You’ll usually need a VPC, subnets, and security group rules that allow traffic to your instances. Ensure you have:
- A VPC network
- At least one subnet in the desired availability zones
- Security group rules allowing inbound traffic (HTTP/HTTPS, SSH, etc.)
- Proper routing (default route and any needed gateways)
Auto scaling will create instances in the configured networks. If your networking isn’t ready, the instances will be born into a sad world where they can’t serve requests.
4) Load Balancer Considerations (Highly Recommended)
If your service is user-facing, a load balancer (LB) is strongly recommended. Auto scaling works best when traffic can be distributed across instances as the group size changes.
Without a load balancer, you can still scale based on metrics, but your users might not reach the new instances unless you implement a routing layer yourself.
Step-by-Step: Setting Up Auto Scaling on Huawei Cloud Accounts
Now we get into the practical setup. Exact menu wording may vary slightly as the console evolves, but the concepts remain consistent.
Step 1: Create or Select an ECS Instance Template (What Instances Look Like)
Most auto scaling setups require a template or configuration describing how new instances should be created.
Start by deciding:
- Operating system image (and its availability in your region)
- Instance flavor (CPU/RAM sizing)
- Network interface configuration
- Security groups
- Storage (system disk and optional data disks)
- User data / initialization scripts (optional but often useful)
Initialization scripts can install your application, pull configuration, register services, etc. If you don’t do this, the new instances may join the party but not start the service—like bringing extra chairs to a meeting that nobody scheduled.
Pro Tip: Design for Statelessness
To scale in and out safely, keep instances as stateless as possible. If you need state (sessions, uploads, etc.), store it in external systems like databases, object storage, caches, or shared storage solutions.
This is what makes scaling in safe: when instances terminate, you won't lose important data that lived only on ephemeral disks.
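As a minimal sketch of externalized state, assuming you have a Redis-compatible store reachable from your instances (the host name below is a placeholder), session data can live outside the instance so any node can serve any request:

```python
import json
import redis  # pip install redis

# Placeholder endpoint: point this at your own Redis (or managed cache).
r = redis.Redis(host="redis.internal.example", port=6379, decode_responses=True)

def save_session(session_id: str, data: dict, ttl_seconds: int = 3600) -> None:
    # Any instance in the scaling group can read this back later.
    r.setex(f"session:{session_id}", ttl_seconds, json.dumps(data))

def load_session(session_id: str) -> dict | None:
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```

With this pattern, terminating an instance during scale-in loses nothing that matters.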
Step 2: Configure an Auto Scaling Group
In the auto scaling console, create an auto scaling group and provide:
- Group name: descriptive, so Future You can find it quickly.
- VPC and subnets: where instances will be created.
- Instance configuration/template: what to launch.
- Desired capacity: initial number of instances.
- Minimum capacity: the lower bound.
- Maximum capacity: the upper bound.
Example starting values (adjust to your needs):
- Minimum: 2
- Desired: 2 or 3
- Maximum: 10
Why this matters: if you set min and max too narrowly, you’ll either never scale up enough or waste money scaling too aggressively.
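A trivial but useful habit is encoding the min ≤ desired ≤ max invariant somewhere your tooling can check it before you touch the console. A minimal sketch:

```python
def validate_capacity(minimum: int, desired: int, maximum: int) -> None:
    # The group will clamp to these bounds, so catch mistakes up front.
    if not (0 <= minimum <= desired <= maximum):
        raise ValueError(
            f"Expected min <= desired <= max, got {minimum}/{desired}/{maximum}"
        )

validate_capacity(minimum=2, desired=3, maximum=10)  # passes silently
```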
Consider Availability Zones
If the console lets you select multiple zones, choose enough to improve fault tolerance. A single-zone scaling group can be fine for dev environments, but production setups generally benefit from multi-zone design.
Step 3: Attach Your Load Balancer (If You Have One)
If you’re using an LB, you typically configure the auto scaling group to register instances to a target group or backend pool. This ensures traffic is routed to newly created instances automatically.
Check these details:
- Health check settings (path, port, thresholds)
- Listener ports (HTTP/HTTPS)
- Target group membership behavior
Health checks are critical. If your health check expects something your app doesn’t return (like a custom endpoint that’s not ready), instances will be considered unhealthy and effectively ignored by the load balancer.
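Here is a minimal sketch of a readiness-aware health endpoint, assuming a Flask app; the /healthz path and port 8080 are placeholders you'd align with your LB health check settings:

```python
from flask import Flask, jsonify  # pip install flask

app = Flask(__name__)
ready = False  # flipped to True once the app has finished warming up

@app.route("/healthz")
def healthz():
    # Return 200 only when the instance can actually serve traffic, so the
    # load balancer doesn't route requests to a half-started node.
    if ready:
        return jsonify(status="ok"), 200
    return jsonify(status="starting"), 503

if __name__ == "__main__":
    # In a real app, flip `ready` only after initialization completes.
    ready = True
    app.run(host="0.0.0.0", port=8080)
```

The 503-while-starting behavior is the point: a new instance should announce "not yet" rather than silently failing checks.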
Step 4: Define Scaling Policies (How the Group Changes)
Now you define the rules for scaling. You’ll typically see different policy types such as:
- Simple scaling: fixed step changes (add/remove N instances).
- Step scaling: changes based on how far the metric deviates.
- Scheduled scaling: time-based scaling (optional, not metric-driven).
For most teams starting out, simple or step scaling is usually enough.
Scaling Out Rule Example
Say you want to scale out when CPU utilization rises above 70% for a sustained period.
- Metric: CPU utilization
- Threshold: 70%
- Evaluation period: e.g., 3 consecutive periods
- Action: increase by 1 instance (step or simple scaling)
For scaling out, be mindful of "scale stampedes." If the alarm keeps firing and your cooldown is short, you can end up with repeated scale out actions. That's why we configure cooldown and max limits.
Scaling In Rule Example
For scale in, you may trigger when CPU falls below, say, 30%.
- Metric: CPU utilization
- Threshold: 30%
- Evaluation period: e.g., 5 consecutive periods
- Action: decrease by 1 instance (down to min capacity)
Scaling in too quickly can cause instability (rapid up/down cycles). Many setups scale in more cautiously than they scale out.
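To see why the asymmetry helps, here is an illustrative Python sketch of the hysteresis logic above: different thresholds and different evaluation periods for each direction. In practice the monitoring service's alarms do this evaluation for you; this is just the idea written down.

```python
from collections import deque

SCALE_OUT_THRESHOLD, SCALE_OUT_PERIODS = 70.0, 3
SCALE_IN_THRESHOLD, SCALE_IN_PERIODS = 30.0, 5

# Keep just enough samples for the longer of the two windows.
samples: deque[float] = deque(maxlen=max(SCALE_OUT_PERIODS, SCALE_IN_PERIODS))

def decide(cpu_percent: float) -> str:
    """Feed one metric sample per evaluation period; returns an action."""
    samples.append(cpu_percent)
    recent_out = list(samples)[-SCALE_OUT_PERIODS:]
    recent_in = list(samples)[-SCALE_IN_PERIODS:]
    if len(recent_out) == SCALE_OUT_PERIODS and all(
        s > SCALE_OUT_THRESHOLD for s in recent_out
    ):
        return "scale_out"  # hot for 3 straight periods
    if len(recent_in) == SCALE_IN_PERIODS and all(
        s < SCALE_IN_THRESHOLD for s in recent_in
    ):
        return "scale_in"  # quiet for 5 straight periods
    return "hold"
```

The gap between 30% and 70% is the hysteresis band: CPU bouncing around 50% triggers nothing, which is exactly what you want.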
Step 5: Configure Cooldown, Warm-up, and Safety Limits
Cooldown is the auto scaler’s version of “give me a minute.” After it triggers a scaling action, it waits before considering another scaling action. This helps avoid oscillation.
Set cooldown values based on:
- Instance startup time
- Application readiness time
- Load balancer health check time
If your instances take 2 minutes to boot and become healthy, a cooldown of 30 seconds is like expecting the new chef to cook instantly. You’ll see weird behavior: alarms keep firing while the new instance is still starting.
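A quick back-of-the-envelope way to pick a floor for cooldown (all durations here are examples; measure your own):

```python
# Cooldown should cover the full "new instance is useful" pipeline.
boot_seconds = 90          # OS boot plus user-data script
app_ready_seconds = 60     # application warm-up
health_check_seconds = 30  # LB checks needed to mark the instance healthy

min_sensible_cooldown = boot_seconds + app_ready_seconds + health_check_seconds
print(min_sensible_cooldown)  # 180 -> round up, e.g. to 300 seconds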
Also, confirm that your min/max limits match business needs. The autoscaler will not exceed max even if the metric is on fire.
Step 6: Create Alarms / Bind Metric Triggers
Auto scaling typically relies on monitoring metrics from Huawei Cloud Monitoring services. You’ll define alarms tied to those metrics.
Common metrics to start with:
- CPU utilization
- Memory utilization (if available)
- Network in/out
- Application-level metrics (request rate, error rate) if you push custom metrics
In many real systems, CPU alone is a decent starting point but not always the best long-term choice. For example, CPU can be low while the request queue builds up because of slow downstream dependencies. If you can, scale on something closer to user experience (like request latency or queue depth).
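If you go the custom-metric route, the pattern looks like the sketch below: compute a user-centric signal and hand it to your metrics pipeline. Note that publish_metric is a hypothetical stand-in, not a real Cloud Eye call; wire it to your monitoring service's actual custom-metric API.

```python
import redis  # pip install redis

# Placeholder endpoint for the queue's backing store.
r = redis.Redis(host="redis.internal.example", port=6379)

def publish_metric(name: str, value: float) -> None:
    # Hypothetical: replace with your monitoring service's custom-metric API.
    print(f"metric {name}={value}")

def report_queue_depth(queue_name: str = "jobs") -> None:
    depth = r.llen(queue_name)  # pending jobs waiting for a worker
    publish_metric("queue_depth", float(depth))
```

Run report_queue_depth on a schedule, then alarm on queue_depth instead of (or alongside) CPU.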
Metric Granularity and Evaluation Period
Take care with how often metrics are sampled and how long you require the threshold to be exceeded. A short evaluation period makes scaling react quickly but may respond to noise. A longer evaluation period is calmer but might react too late.
A practical approach:
- Start with 1–3 evaluation periods for scale out
- Use slightly longer evaluation for scale in
- Tune after observing behavior under load
Step 7: Validate with a Controlled Test
You don't have to wait for "real traffic" to test. You can simulate load and verify scaling behavior; a minimal load generator sketch follows the test plan below.
Design Your Test
- Start with desired capacity (e.g., 2 instances)
- Generate load to push CPU/request metrics above your threshold
- Observe scaling actions and instance provisioning time
- Reduce load to trigger scale in behavior
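A minimal load generator is often all you need for this. In the sketch below, the URL, worker count, and request count are placeholders; tune them until the metric crosses your scale-out threshold.

```python
import concurrent.futures
import requests  # pip install requests

TARGET_URL = "http://your-load-balancer.example/api/ping"  # placeholder

def hit(_: int) -> int:
    # One request against the LB endpoint; returns the HTTP status code.
    return requests.get(TARGET_URL, timeout=5).status_code

# 50 concurrent workers firing 5000 requests total.
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    statuses = list(pool.map(hit, range(5000)))

# Summarize status codes so you can spot errors during the ramp-up.
print({code: statuses.count(code) for code in set(statuses)})
```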
During the test, check:
- Auto scaling event logs (to confirm policies were triggered)
- Instance lifecycle events (created, joining, healthy, terminated)
- Load balancer target health status
- Monitoring graphs for metrics and alarm state
If scaling doesn’t happen, don’t immediately blame the auto scaler. The issue is usually one of these classics:
- Metric isn’t available or isn’t the one you think it is
- Alarm threshold is wrong or units mismatch
- Cooldown prevents additional actions
- Max capacity is already reached
- Permissions or quotas block instance creation
Troubleshooting: The “Why Didn’t It Scale?” Checklist
Let’s save you time. Here’s a checklist you can run through quickly.
1) The Alarm Never Fired
- Confirm the metric name and scope match your instances (some setups differ by dimension).
- Check metric units (percentage vs. absolute values).
- Verify evaluation period and trigger conditions.
2) Alarm Fired, But No Instances Were Added
- Confirm max capacity wasn’t reached.
- Check quota limits for ECS resources.
- Check IAM permissions for scaling actions.
- Confirm the scaling policy is correctly attached to the group.
3) Instances Were Created, But They Didn’t Serve Traffic
- Check user-data scripts and application startup logs.
- Verify load balancer health checks (endpoint, port, expected status code).
- Confirm security group rules allow traffic from the LB.
4) Scaling “Oscillates” (Scale Out/In Repeatedly)
- Increase cooldown.
- Add hysteresis: use different thresholds for scale out and scale in (e.g., 70% out, 40% in).
- Increase evaluation periods to smooth noise.
Cost Control: Avoid Turning Auto Scaling Into a Spending Machine
Auto scaling is cost-aware only if you set it up sensibly. Otherwise, it can become the infrastructure equivalent of ordering the biggest combo meal every time you feel a little snacky.
To keep costs under control:
- Set a realistic maximum capacity.
- Scale in measured steps rather than huge leaps.
- Consider more stable metrics (e.g., request queue length) if CPU is noisy for your workload.
- Tune cooldown based on real instance warm-up time.
- Review scaling events during a test phase.
Recommended Production Configuration Patterns
If you want a “works most of the time” approach, consider these patterns.
Pattern A: CPU-Based Scaling for Web Services
Use CPU utilization with conservative evaluation periods. Ensure your application is horizontally scalable and stateless. Combine with an LB for seamless traffic distribution.
Pattern B: Queue/Latency-Based Scaling for Async Workloads
If you process jobs, scale based on queue depth or processing backlog. If you can track request latency, scaling on latency can be more user-centric than CPU.
Pattern C: Scheduled Scaling for Predictable Traffic
For workloads that spike at known times (e.g., business hours), scheduled scaling can reduce reaction time and cost. Combine it carefully with metric-based policies so you still handle unexpected surges.
Security and Operational Hygiene
Auto scaling doesn’t remove your need for security and operations. In fact, it increases the number of instances you may manage, so hygiene matters more.
Use Least Privilege IAM
When creating permissions for auto scaling, follow least privilege. Avoid giving broad admin rights if a narrower role suffices.
Make Instances Self-Contained (But Not Chaotic)
Your initialization scripts should be idempotent where possible and should fail loudly in logs. A new instance that partially configures itself but never starts correctly is a reliable way to ruin your demo day.
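As a sketch of what "idempotent and loud" can look like in practice (the marker path and service name below are placeholders for your own setup):

```python
import logging
import pathlib
import subprocess
import sys

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
MARKER = pathlib.Path("/var/run/app-bootstrap.done")  # placeholder path

def bootstrap() -> None:
    if MARKER.exists():
        logging.info("bootstrap already completed; skipping (idempotent)")
        return
    # Replace with your real setup steps; check=True makes failures loud.
    subprocess.run(["systemctl", "start", "my-app"], check=True)
    MARKER.touch()
    logging.info("bootstrap finished")

if __name__ == "__main__":
    try:
        bootstrap()
    except Exception:
        logging.exception("bootstrap FAILED")  # fail loudly in the logs
        sys.exit(1)
```

A nonzero exit plus a stack trace in the logs beats a half-configured instance that looks healthy from a distance.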
Log and Monitor Scaling Events
Ensure you can correlate:
- Scaling policy triggers
- Alarm state changes
- Instance lifecycle and health check results
This makes debugging dramatically faster and prevents “mystery outages” that appear to have happened from pure vibes.
Common Mistakes (So You Can Skip Them)
- Setting thresholds too aggressively: you end up scaling on noise.
- Ignoring warm-up time: cooldown doesn’t match your boot time.
- Forgetting load balancer health checks: instances scale but never receive traffic.
- Choosing max capacity without thinking: cost risk or resource quota issues.
- Using CPU as the only metric forever: sometimes it’s not the right signal.
A Mini Walkthrough Example (Putting It All Together)
Imagine you run an API service behind a load balancer. You expect occasional spikes. You want auto scaling to handle them.
You decide on the following (captured as data in the sketch after this list):
- Min instances: 2
- Max instances: 8
- Scale out when CPU > 70% for 3 periods
- Scale in when CPU < 40% for 5 periods
- Cooldown: 300 seconds
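One way to keep those decisions reviewable is to write them down as data. This is documentation of intent, not an actual Huawei Cloud API payload; translate it into the console settings (or your infrastructure-as-code tool) field by field.

```python
# The walkthrough's decisions as plain data -- not a real API request body.
scaling_config = {
    "capacity": {"min": 2, "desired": 2, "max": 8},
    "scale_out": {"metric": "cpu_utilization", "op": ">", "threshold": 70,
                  "evaluation_periods": 3, "adjustment": +1},
    "scale_in": {"metric": "cpu_utilization", "op": "<", "threshold": 40,
                 "evaluation_periods": 5, "adjustment": -1},
    "cooldown_seconds": 300,
}
```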
In the auto scaling group:
- You select your ECS template (including app startup scripts).
- You connect the group to the LB target group.
- You create scaling policies for both out and in.
- You bind alarms to those policies with the monitoring service.
During test load:
- CPU rises above 70%
- Alarm triggers
- Auto scaling increases desired instances by 1 step
- New instance boots, passes health checks, and begins receiving traffic
- When CPU drops below 40%, the group scales back down gradually
If something fails, you follow the troubleshooting checklist. Usually it’s either health checks, cooldown mismatch, or incorrect metric selection.
How to Keep Tuning After the First Deployment
Auto scaling isn’t “set and forget” the way you might hope. It becomes better as you observe real behavior.
- Review scaling history for the last 1–2 weeks.
- Check if scaling happens too late or too early.
- Compare alarm triggers to real performance metrics (latency, errors).
- Adjust thresholds and cooldown based on observed warm-up times.
And if you see frequent scale events, consider whether your thresholds are too tight or your workloads are more spiky than you expected.
Conclusion: You Now Have an Infrastructure That Pays Attention
Setting up auto scaling on Huawei Cloud accounts isn’t magic, but it does reward careful setup. Start with the fundamentals: templates, networking, correct permissions, and realistic quotas. Then define scaling policies tied to meaningful metrics, with sensible cooldown and safety limits. Finally, validate under load and iterate based on observed behavior.
Once your setup is tuned, you’ll get two wins at the same time: your users experience smoother performance during spikes, and your costs stay under control during quieter periods. Auto scaling isn’t just about adding servers—it’s about making your system responsive and resilient without micromanaging every traffic fluctuation.
A good next step: map your workload type (web API, batch jobs, message queue consumers) and the metrics you already collect onto a concrete starting policy set of thresholds, evaluation periods, and scaling steps, then revisit it after your first real traffic spike.

